As requested, I'm alerting the ML that I've got the first patch of several to come. This one only addresses the ColumnFamilyStore class - and only changes the name. There's follow up tickets to change method and variable names. As we agreed, I'm doing this incrementally, so please let's keep that in mind, I have no interest in producing a 10K patch that changes all the things.
Since the patch only touches lines explicitly naming ColumnFamilyStore this patch should be reasonably non-intrusive. https://issues.apache.org/jira/browse/CASSANDRA-14340 Jon On Thu, Mar 22, 2018 at 8:09 AM Jon Haddad <j...@jonhaddad.com> wrote: > Cool. I think there’s general agreement that doing this in as small bites > as possible is going to be the best approach. I have no interest in mega > patches. > > > The combined approach takes a > > change that's already non-trivially dealing with complex subsystem > > changes and injects a bunch of trivial renaming noise across unrelated > > subsystems into the signal of an actual logic refactor. > > I agree. This is why I like the idea of proactively working to improve > the readability of the codebase as a specific goal, rather than being > wrapped into some other unrelated patch. Keeping the scope in check is the > challenge. Simple class and method renames, as several have pointed out, > is easy enough with IDEA. > > I’ll start with class renames, as individual patches for each of them. > I’ll be sure to call it out on the ML. First one will be ColumnFamilyStore > -> TableStore. > > Jon > > > On Mar 22, 2018, at 7:13 AM, Jason Brown <jasedbr...@gmail.com> wrote: > > > > Jon, > > > > Thanks for bringing up this topic. I'll admit that I've been around this > > code base for long enough, and have enough accumulated history, that I > > probably can't fully appreciate the impact for a newcomer wrt naming. > > However, as Josh points out, this situation probably happens to "every > > non-trivially aged code-base ever". > > > > One thing I'd like to add is that with these types of large refactoring > > changes, the review effort is non-trivial. This is because the review > still > > has to ensure that correctness is preserved and it's easy to overlook a > > seemingly innocuous change. > > > > That being said, I am supportive of this effort. However, I believe it's > > going to be best, for contributor and reviewer, to break it up into > > smaller, more digestible pieces. I'd also like to request that we not go > > whole hog and try to do everything in a compressed time frame; reviewer > > availability is already stretched thin and I'm afraid of deepening the > > review queue, especially mine :) > > > > Thanks, > > > > -Jason > > > > > > > > > > On Thu, Mar 22, 2018 at 6:41 AM, Josh McKenzie <jmcken...@apache.org> > wrote: > > > >>> Some of us have big patches in flight, things that actually > >>> pay off some technical debt, and dealing with such renames is rebase > >> hell :\ > >> For sure, but with a code-base this old / organically grown, I expect > >> this will always be the case. If we're talking something as simple as > >> an intellij rename refactor, while menial, couldn't someone with a > >> giant patch just do the same thing on their side and spend half an > >> hour of their life clicking next? ;) > >> > >>> That said, there is good time for such renames - it’s during > >>> those major refactors and rewrites. When you are > >>> changing a subsystem, might as well do the appropriate renames. > >> Does that hold true for a code-base with as much static state and > >> abstraction leaking / bad factoring as we have? (i.e. every > >> non-trivially aged code-base ever) The combined approach takes a > >> change that's already non-trivially dealing with complex subsystem > >> changes and injects a bunch of trivial renaming noise across unrelated > >> subsystems into the signal of an actual logic refactor. > >> > >> On Thu, Mar 22, 2018 at 9:31 AM, Aleksey Yeshchenko <alek...@apple.com> > >> wrote: > >>> Poor and out-of-date naming of things is probably the least serious > part > >> of our technical debt. Bad factoring, and straight-up > >>> poorly written components is where it’s really at. > >>> > >>> Doing a big rename for rename sake alone does more harm than it is > good, > >> sometimes. Some of us have big patches > >>> in flight, things that actually pay off some technical debt, and > dealing > >> with such renames is rebase hell :\ > >>> > >>> That said, there is good time for such renames - it’s during those > major > >> refactors and rewrites. When you are > >>> changing a subsystem, might as well do the appropriate renames. > >>> > >>> — > >>> AY > >>> > >>> On 20 March 2018 at 22:04:48, Jon Haddad (j...@jonhaddad.com) wrote: > >>> > >>> Whenever I hop around in the codebase, one thing that always manages to > >> slow me down is needing to understand the context of the variable names > >> that I’m looking at. We’ve now removed thrift the transport, but the > >> variables, classes and comments still remain. Personally, I’d like to > go in > >> and pay off as much technical debt as possible by refactoring the code > to > >> be as close to CQL as possible. Rows should be rows, not partitions, I’d > >> love to see the term column family removed forever in favor of always > using > >> tables. That said, it’s a big task. I did a quick refactor in a branch, > >> simply changing the ColumnFamilyStore class to TableStore, and pushed > it up > >> to GitHub. [1] > >>> > >>> Didn’t click on the link? That’s ok. The TL;DR is that it’s almost 2K > >> LOC changed across 275 files. I’ll note that my branch doesn’t change > any > >> of the almost 1000 search results of “columnfamilystore” found in the > >> codebase and hundreds of tests failed on my branch in CircleCI, so that > 2K > >> LOC change would probably be quite a bit bigger. There is, of course, a > lot > >> more than just renaming this one class, there’s thousands of variable > names > >> using any manner of “cf”, “cfs”, “columnfamily”, names plus comments and > >> who knows what else. There’s lots of references in probably every file > that > >> would have to get updated. > >>> > >>> What are people’s thoughts on this? We should be honest with ourselves > >> and know this isn’t going to get any easier over time. It’s only going > to > >> get more confusing for new people to the project, and having to figure > out > >> “what kind of row am i even looking at” is a waste of time. There’s > >> obviously a much bigger impact than just renaming a bunch of files, > there’s > >> any number of patches and branches that would become outdated, plus > anyone > >> pulling in Cassandra as a dependency would be affected. I don’t really > have > >> a solution for the disruption other than “leave it in place”, but in my > >> mind that’s not a great (or even good) solution. > >>> > >>> Anyways, enough out of me. My concern for ergonomics and naming might > be > >> significantly higher than the rest of the folks working in the code, > and I > >> wanted to put a feeler out there before I decided to dig into this in a > >> more serious manner. > >>> > >>> Jon > >>> > >>> [1] https://github.com/apache/cassandra/compare/trunk... > >> rustyrazorblade:refactor_column_family_store?expand=1 < > >> https://github.com/apache/cassandra/compare/trunk... > >> rustyrazorblade:refactor_column_family_store?expand=1> > >> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > >> For additional commands, e-mail: dev-h...@cassandra.apache.org > >> > >> > >