> The only sub-proposal I’m particularly unsure about is 17059, which doesn’t 
> seem to increase modularity at all. It looks to be a kind of plugin hook, and 
> IMO should definitely be addressed separately. Perhaps a simple DISCUSS 
> thread and its Jira will suffice?

Ok.  I will remove that one from the CEP to discuss separately.

> On Oct 26, 2021, at 2:32 PM, bened...@apache.org wrote:
> 
>> I'm not particularly sympathetic to the concerns about friction on making 
>> changes to internal APIs since modern IDE tooling makes this a trivial 
>> exercise
> 
> We’re getting abstract here, so this isn’t a rebuttal or even tied too 
> strongly to this particular discussion, but to express my point more clearly.
> 
> We don’t abstract everything in the codebase, and in fact in general we (or 
> at least, I) try to keep things concrete as long as there’s no reason to 
> abstract them, because this is usually easier to reason about and lower 
> overhead to modify. This is true even on the single class level, so of course 
> it happens at the module level. This isn’t about the IDE refactoring, but the 
> cognitive burden of reasoning simultaneously about the concrete class and the 
> abstraction, and how they relate.
> 
> The problem with premature abstraction, and particularly when multiple 
> implementations start appearing, is that you have to start formalising the 
> abstractions in ways that permit you to reason only about the abstraction. 
> This necessarily means eschewing some knowledge of how the concrete 
> implementation(s) work. This may prevent very useful simplifications for how 
> you interact with a specific concrete implementation, as we have to code to 
> the API. This may prevent optimisations. This may also introduce additional 
> complexity when either implementing the abstraction or when reasoning about 
> the actions you are performing against it, where often you may not entirely 
> ignore the concrete implementation (due to imperfect or ambiguous API 
> specifications), so you must now consider if you are compatible with both the 
> abstraction and any known concrete implementations.
> 
> These are all additional burdens, but we often pay the cost for perceived 
> benefits.
> 
> It seems to me though that this discussion is conflating 
> modularisation/pluggability with decoupling, which is a benefit we might gain 
> in return for these additional costs. To me this is a distinct problem, 
> however. It’s quite possible to modularise and yet tightly couple, though 
> usually it will break tight coupling. But breaking tight coupling doesn’t 
> require modularisation, and certainly doesn’t require pluggability.
> 
> To bring it back to this discussion, the intent of a piece of work always 
> drives the outcome, and in my opinion it is best to always consider a work in 
> its actual context. The primary purpose of this work is pluggability, and so 
> this will inform the API modifications. A straightforward goal of reducing 
> tight coupling in the codebase would likely approach this problem 
> differently. None of this is a bad thing, just in my opinion the nature of 
> development.
> 
> That said, I’m broadly happy to see this work go ahead. I would prefer to 
> split the conversations out into their driving projects for the 
> aforementioned reasons, but I wouldn’t veto the proposal on that basis. It 
> would be nice to see others’ opinions about this.
> 
> The only sub-proposal I’m particularly unsure about is 17059, which doesn’t 
> seem to increase modularity at all. It looks to be a kind of plugin hook, and 
> IMO should definitely be addressed separately. Perhaps a simple DISCUSS 
> thread and its Jira will suffice?
> 
> 
> From: Joshua McKenzie <jmcken...@apache.org>
> Date: Tuesday, 26 October 2021 at 19:16
> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> Subject: Re: [DISCUSS] CEP-18: Improving Modularity
>> 
>> To me having some defined interfaces for interacting with different
>> sections of the code is a huge boon for improving developer productivity
>> going forward in the project.  Every place where we can reduce the amount
>> of code reaching inside another module to get at a random internal class is
>> a positive,
> 
> I've long been of the opinion that the benefits outweigh the costs of
> having clear interface points between major subsystems in a codebase. I'm
> not particularly sympathetic to the concerns about friction on making
> changes to internal APIs since modern IDE tooling makes this a trivial
> exercise, however I _am_ quite sympathetic to the concerns about
> introducing friction against deeper integrations between subsystems.
> 
> That said, we have a history on the project of being somewhat hot and cold
> when it comes to our approach to performance testing; I think our low
> hanging fruit as a project revolves more around discipline and
> reproducibility on knowing where our performance is today and making
> changes with an eye to that rather than keeping open the flexibility of
> tightly coupling subsystems through their implementations.
> 
> With the modern runtime environment shifting so much toward
> containerization I can't help but think smaller, clearly modularized
> components are more resilient against a rapidly evolving runtime
> environment and more sympathetic to the constrained resource environments
> they run in, as well as more classically optimizable in their own right.
> 
> I air all this just to contribute perspective to the discussion; all that
> said, I think refactoring APIs as a pure reflection of what the DB is doing
> today just risks ossifying something that grew up organically and probably
> isn't going to do us any favors, so having a use-case (or better yet a few
> implementations) we're deriving an interface from, or targeting a more
> testable / mockable structure plus introducing those tests should give us
> guidance to improve the route we go.
> 
> ~Josh
> 
> 
> On Mon, Oct 25, 2021 at 4:22 PM Jeremiah D Jordan <jerem...@datastax.com>
> wrote:
> 
>> As Henrik said we have been refactoring access to these different internal
>> APIs as part of some larger work.  For this CEP we pulled together a bunch
>> of the smaller ones into one place, similar to the refactoring proposed in
>> CEP-10, as we felt doing many small CEPs, one per module, would be less
>> productive if there was support in the project in general for trying to
>> standardize access to different sections of the code and start creating a
>> more defined internal API.  If there is consensus that it would be better
>> to propose each change as its own CEP, or even just as single tickets
>> without a CEP for these internal refactors, we can do that as well.  The
>> CEP process is evolving as we go through these, so just trying to figure
>> out the best way forward.
>> 
>> The currently proposed changes in CEP-18 should all include improved test
>> coverage of the modules in question.  We have been developing them all with
>> a requirement that all changes have at least 80% code coverage in the
>> SonarCloud JaCoCo reports.  We have also found and fixed some bugs in the
>> existing code during this development work.
>> 
>> To me having some defined interfaces for interacting with different
>> sections of the code is a huge boon for improving developer productivity
>> going forward in the project.  Every place where we can reduce the amount
>> of code reaching inside another module to get at a random internal class is
>> a positive, as it prevents unknown side effects when changing that module
>> when the person developing the new feature did not realize other parts of
>> the code were depending on some current internal behavior that was not
>> clearly part of the module's interface.
>> 
>> On the question of changing internal interfaces that I have seen in some
>> other venues, I do not think creating such interfaces should prevent us
>> from changing them as needed for future work.  I think having the
>> interfaces actually improves on our ability to do so without breaking other
>> parts of the code.  My suggestion would be that we try not to make such
>> changes in patch releases if possible, but again I wouldn’t let that hold
>> anything back.
>> 
>> So do people feel we should re-propose these as multiple CEPs or just
>> tickets?  Or do people prefer to have a discussion/vote on the idea of
>> improving the modularity of the code base in general?
>> 
>> -Jeremiah
>> 
>>> On Oct 25, 2021, at 9:26 AM, bened...@apache.org wrote:
>>> 
>>> Thanks Henrik for the additional context.
>>> 
>>> I’m not personally a fan of modularity only for modularity’s sake.
>>> Everything in software is a balancing act of competing priorities, and
>>> while pluggability supports certain use cases it can slow down development
>>> or prevent deeper integrations by preventing assumptions about how systems
>>> operate.
>>> 
>>> To be clear, I’m fully in favour of helping to enable your use cases, I
>>> just think it is important to make a decision for each refactor based on
>>> the merits and goals in question. If the justification is improved testing,
>>> then testing should be a core goal of the CEP. If it’s enabling a feature
>>> to be upstreamed later, I personally would prefer to tie the refactors to
>>> those features – which I hope will all find broad support for inclusion;
>>> certainly those I have heard of, I am eager to see arrive in Cassandra.
>>> 
>>> If the goal is to support entirely external features, we have to decide
>>> what kind of support we offer to these APIs, and this probably needs to be
>>> discussed on a per-API basis with the justification for pluggability
>>> weighed against any constraints this imposes on development. The most
>>> obvious example here is membership and schema, which I think is primarily
>>> to support an external dependency but we expect this area of the codebase
>>> to be significantly revised over the coming months.
>>> 
>>> 
>>> From: Henrik Ingo <henrik.i...@datastax.com>
>>> Date: Monday, 25 October 2021 at 14:52
>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Subject: Re: [DISCUSS] CEP-18: Improving Modularity
>>> Hi Benedict
>>> 
>>> This CEP is a bundle of APIs arising out of our recent work to re-architect
>>> Cassandra into a more cloud native architecture. What our product marketing
>>> has chosen to call "Serverless" is a variant of Cassandra where we have
>>> separated compute from storage (coordinator vs data node), used S3-like
>>> storage, and made various improvements to better support multi-tenancy in a
>>> single Cassandra (Serverless) cluster. This whitepaper [1] explains this
>>> work in detail for those of you interested to learn more. (Apologies that
>>> it requires registration and the first page may at times sound a bit
>>> marketingy, but it's really the most detailed report we have published so
>>> far.)
>>> 
>>> [1] https://www.datastax.com/resources/whitepaper/astra-serverless
>>> 
>>> The above work was implemented in a way where by default a user can
>>> continue to run Cassandra in the familiar "classic" way. The APIs
>>> introduced by CEP-18 on the other hand allow alternate or additional
>>> functionality to be provided, which in our case we have used to create a
>>> "serverless" way of deploying a Cassandra cluster.
>>> 
>>> The logic behind proposing this bundle of APIs separately, is roughly for
>>> these reasons:
>>> 
>>> The APIs touch existing code and functionality, so to minimize risk to the
>>> next Cassandra release, it would make sense to try to complete merging this
>>> work as early as possible in the development cycle. For the same reason,
>>> keeping the new implementations out of this CEP allows us to focus review -
>>> both of the CEP, and the eventual pull requests - on the APIs themselves,
>>> whereas the related implementations (or plug-ins) would add to the scope
>>> quite significantly. On the other hand non-default plugin functionality can
>>> be added later with much lower risk.
>>> 
>>> Second, while it's completely fair to ask for context, why was this
>>> particular refactoring or API done in the first place, the assumption for a
>>> CEP like this one is that better defined interfaces, that are better
>>> documented and come with better test coverage than existing code, should be
>>> enough legs to stand on in itself. Also, in the best case a good API will
>>> also enable other implementations than the one we had in mind when
>>> developing the API, so we wouldn't want to tie the discussion too much into
>>> the implementation that happened to be the first. (As an example of this
>>> working out nicely, your own work in CASSANDRA-16926 was for you motivated
>>> by enabling a new kind of testing, but it also just so happens it is the
>>> same work that enables someone to implement remote file storage, which we
>>> therefore could drop from this CEP-18.)
>>> 
>>> Conversely also, it was our expectation when proposing this CEP that
>>> "better modularity" at least on a high level should be a fairly
>>> straightforward conversation, while the actual plugins that make up our
>>> "serverless" new architecture may reasonably ignite much more debate, or at
>>> least questions as to how they work. As we have a backlog of several fairly
>>> substantial CEPs lined up, we are trying to be very mindful of the
>>> bandwidth of the developers on this list. For example, last week Jacek also
>>> proposed CEP-17 for discussion. So we are trying to focus the discussion on
>>> what's in CEP-17 and CEP-18 for now. (In addition I remember at least 2
>>> CEPs that were discussed but not yet voted on. I don't know if this adds to
>>> cognitive load for anyone else than myself.)
>>> 
>>> henrik
>>> 
>>> On Mon, Oct 25, 2021 at 12:39 PM bened...@apache.org <bened...@apache.org>
>>> wrote:
>>> 
>>>> Hi Jeremiah,
>>>> 
>>>> My personal view is that work to modularise the codebase should be tied to
>>>> specific use cases. If improved testing is the purpose of this work, I
>>>> think it would help to include those improved tests that you plan to
>>>> support as goals for the CEP.
>>>> 
>>>> If on the other hand some of this work is primarily intended to enable
>>>> certain features, I personally think it would be preferable to tie them to
>>>> those features - perhaps with their own CEP?
>>>> 
>>>> 
>>>> From: Jeremiah Jordan <jeremiah.jor...@gmail.com>
>>>> Date: Friday, 22 October 2021 at 16:24
>>>> To: Cassandra DEV <dev@cassandra.apache.org>
>>>> Subject: [DISCUSS] CEP-18: Improving Modularity
>>>> Hi All,
>>>> As has been seen with the work already started in CEP-10, increasing the
>>>> modularity of our subsystems can improve their testability, and also the
>>>> ability to try new implementations without breaking things.
>>>> 
>>>> Our team has been working on doing this and CEP-18 has been created to
>>>> propose adding more modularity to a few different subsystems.
>>>> 
>>>> 
>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-18%3A+Improving+Modularity
>>>> 
>>>> CASSANDRA-17044 has already been created for Schema Storage changes related
>>>> to this work and more JIRAs and PRs are to follow for the other subsystems
>>>> proposed in the CEP.
>>>> 
>>>> Thanks,
>>>> -Jeremiah Jordan
>>>> 


