> To enable a faster development cadence I am proposing the following steps
> to reorg the current pulsar monorepo and test harness.

How is the current repo impeding development cadence?

> Milestone 1
>
> Move doc artifacts to an independent repo.

Moving the documents to a different repo will actually have the opposite
effect on development: if a dev updates something that needs a
corresponding update to the documentation, it will require two different PRs
to two different repos. What will most likely happen is that the second PR
will never happen, so documentation will lag more than it already does.

Breaking the docs into a different repo will also break the 1-1 relationship
between a set of docs and a release, which will have to be managed in some
other way.

> Milestone 2
>
> Move connectors, external integration to another repo.
> We need to set up we can call the new repo apache/pulsar-ext.

This sounds like a good idea. The connector interfaces should be stable
enough that the repos can be uncoupled. How would this work for docs and the
release cycle though? Moving them out of the repo implies a separate release
cycle.

> This should be most of the connectors, and other integration jars just such
> storm, flink and spark sources and sinks. The integration test suite needs
> to be split also.

Integration test suite for the connectors, or all the tests? If you mean all
the tests, that's a big -1 from me. Tests should live with the code that they
are testing. However, if you mean just the connectors' integration tests,
then sure, they should live with the connectors.

> This requires new release aggregator script to be built, that can make a
> release with artifacts generated from multiple repos.

This will make releases more complicated than they already are, which seems
counter to the originally stated goal.

> Milestone 3
>
> Move C++/python/go artifacts to an external repo.

As with the docs, I worry that this would lead to C++/Python/Go lagging
further behind.

> Milestone 4
>
> Rebuild the broker unit test suite.
> The current broker test suite is quite brittle and unorganized.
> about 20 percent of the tests can move to the integration suite, about 20
> percent will stay, about 60 percent need to be rewritten to use a common that
> can provide test isolation and parallelization via namespaces.

Worth doing, but this is a very large task.

My overall concern with this proposal is that it adds a lot of coordination
overhead without, IMO, adding much benefit to developer cadence. Multiple
repos work well when there are multiple teams and well-defined interfaces
between the teams (in terms of code and organization). I don't think that is
the case right now in Pulsar, except for maybe the connectors. Otherwise,
every developer is (or should be) touching each of the different parts
mentioned here.

-Ivan
