Thanks for driving the proposal. I would like to share some related context from discussions that happened many years ago:

- https://lists.apache.org/thread/y0r9kk0968ydpxtf16x6ql3x6kwy7dc1
- https://lists.apache.org/thread/hfv18cg0yckt5cqd0fc66rp7tth036kf

We have two major approaches:

1. Minimize the persistent size of cursor data:
   • Example: PR:9292 and cursor data compression, possibly with a compressed bitset implementation (RoaringBitmap).
2. Split the ack cursor data into multiple chunks:
   • Example: PIP-81, PIP-381.

LinLin and I previously worked on PIP-81. Personally, I am not a big fan of this solution. While working on PIP-81 and cursor data compression, we found that compression works well in most cases, even when there are millions or tens of millions of ack ranges. I recall we shared data on this before, though I can’t seem to find it now.

From a user perspective, most users are satisfied with the current solution, and only a few need compression enabled. The simplicity of the solution is vital for community users, which was the main reason we gave up on PIP-81 earlier. Pulsar is already complex, so a pluggable solution would be more beneficial in the long term. This way, most users get a clear, simple version, while others who need enhanced solutions can create their own plugins and manage the complexity themselves.

I’m not going to block this proposal, but a few points need clarification:

• Feature Toggle: Add a flag that allows users to enable this feature (keeping it disabled by default until there is higher demand). The complexities of the managed ledger and cursor are well known, so a smooth opt-in process is crucial for users to adopt new features gradually.
• Compatibility Concerns: Since the persistent data structure will change, we need to address rollback scenarios. For instance, if a user has 10MB of cursor data, upgrades to a new version with the PIP changes, and then needs to roll back to the older version, will that user lose their 10MB of cursor data? What steps are required for a rollback to ensure data consistency?
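To make the feature toggle and rollback questions above concrete, here is a minimal sketch in Java. It is not the PIP's design and it does not use any Pulsar API: the class, the version bytes, the `chunkedFormatEnabled` flag, and the string encoding of ack ranges are all hypothetical, chosen only to show why a format change that is on by default could leave an older broker unable to read cursor data after a rollback.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: none of these names exist in Pulsar.
final class CursorFormatSketch {
    static final byte FORMAT_V1 = 1; // layout an older broker understands (assumed)
    static final byte FORMAT_V2 = 2; // new layout introduced by a format change (assumed)

    // Writer on the upgraded broker: emits the new layout only when the
    // hypothetical opt-in flag is enabled.
    static byte[] serialize(String ackRanges, boolean chunkedFormatEnabled) {
        byte version = chunkedFormatEnabled ? FORMAT_V2 : FORMAT_V1;
        byte[] payload = ackRanges.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + payload.length).put(version).put(payload).array();
    }

    // Reader as an older broker would behave after a rollback: it knows V1 only.
    static String readWithOldBroker(byte[] data) {
        if (data[0] != FORMAT_V1) {
            throw new IllegalStateException("unknown cursor format version " + data[0]);
        }
        return new String(data, 1, data.length - 1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String acks = "[5:0..5:99],[5:120..5:300]";
        // Flag disabled: data written after the upgrade is still readable after rollback.
        System.out.println(readWithOldBroker(serialize(acks, false)));
        // Flag enabled: the older broker cannot interpret the persisted cursor data.
        try {
            readWithOldBroker(serialize(acks, true));
        } catch (IllegalStateException e) {
            System.out.println("rollback would fail: " + e.getMessage());
        }
    }
}
```

With the flag disabled by default, an upgrade alone never writes data the previous version cannot read, so a rollback stays safe until the user explicitly opts in. Once the flag is enabled, a rollback needs a documented procedure, which is the part I would like the proposal to spell out.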

Regards,
Penghui

On Tue, Sep 24, 2024 at 1:42 AM Lari Hotari <lhot...@apache.org> wrote:

> On Tue, 24 Sept 2024 at 05:01, Rajan Dhabalia <rdhaba...@apache.org> wrote:
> > However, there are multiple other PRs related to key-shared sub, stats,
> > cursor performance, and other PRs are still blocked by others and people
> > just block it because they think they don't have this usecase. It's so
> > unfortunate that people easily merge implementations which only handle
> > small-scale usecases but the usecases for which Pulsar was built are
> > being blocked or take a long time to merge. It's just that I don't have
> > that energy to keep following up for useful and important changes for
> > Pulsar. And this is one of these examples as well. I have also started
> > discussion about improving the PIP process because it has become painful
> > in many cases.
>
> It's not that individuals want to block changes for no reason. It
> seems that the main reason for blocking changes is the fear of
> regressions. Some areas of the Pulsar codebase aren't well covered in
> our test suites. For example, we don't have performance tests as part
> of the Apache Pulsar repositories. We have a lot of tests, but most of
> them are written in a way that tests the code as the author expects it
> to work. There are very few tests that evaluate features from the
> end-user API perspective or as system tests.
>
> Writing new tests is slow, and the developer experience is poor with
> the current test infrastructure. Adding more tests to the main build
> would slow down Pulsar CI even more. This isn't a new problem; it's
> been around for many years. I'd love to see more proposals and active
> contributions to improve the "safety nets" of Apache Pulsar so that we
> wouldn't fear change. I'm not saying that this is only a testing
> problem. Testability impacts architecture too. Balancing all different
> aspects of the system isn't easy, and it requires effort and
> dedication. We don't currently have enough contributors who are
> investing their time in enabling others to contribute effectively. I
> hope that we can improve together and address the problems we have
> that cause the fear of change. When that is addressed, there would be
> more confidence in accepting new PIPs and changes even when the
> reviewer doesn't have the use case or when they aren't familiar with
> the problem that the PIP is targeting to solve.
>
> -Lari