I really don't think anyone has been recently against such renaming, and in fact, a _lot_ of renaming *has* already happen over time. The problem, as you carefully noted, is that it's such a big task that there is still a lot to do. Anyway, I've yet to see a patch renaming things to match the CQL naming scheme be rejected, so I'd personally encourage such submission. But maybe with a few caveats (already mentioned largely, so repeating here to signify my personal agreement with them): - renaming with large surface area can be painful for ongoing patches or even future merge. That's not a reason for not doing them, but that's imo a good enough reason to do things incrementally/in as-small-as-reasonable steps. Making sure a renaming commit only does renaming and doesn't change the logic is also pretty nice when you rebase such things. - breaking hundreds of tests is obviously not ok :) - pure code renaming is one reasonably simple aspect, but quite a few renaming may have user visible impact. Particularly around JMX where many things are name based on their class, and to a lesser extend some of our tools still use "old" naming. We can't and shouldn't ignore those impact: such user visible changes should imo be documented, and we should make sure we have a reasonably painless (and thus incremental) upgrade path. My hunch is the latter isn't as simple as it seems.
-- Sylvain On Wed, Mar 21, 2018 at 9:06 AM kurt greaves <k...@instaclustr.com> wrote: > As someone who came to the codebase post CQL but prior to thrift being > removed, +1 to refactor. The current mixing of terminology is a complete > nightmare. This would also give a good opportunity document a lot of code > that simply isn't documented (or incorrect). I'd say it's worth doing it in > multiple steps though, such as refactor of a single class at a time, then > followed by refactor of variable names. We've already done one pretty big > refactor (InetAddressAndPort) for 4.0, I don't see how a few more could > make it any worse (lol). > > Row vs partition vs key vs PK is killing me > > On 20 March 2018 at 22:04, Jon Haddad <j...@jonhaddad.com> wrote: > > > Whenever I hop around in the codebase, one thing that always manages to > > slow me down is needing to understand the context of the variable names > > that I’m looking at. We’ve now removed thrift the transport, but the > > variables, classes and comments still remain. Personally, I’d like to go > > in and pay off as much technical debt as possible by refactoring the code > > to be as close to CQL as possible. Rows should be rows, not partitions, > > I’d love to see the term column family removed forever in favor of always > > using tables. That said, it’s a big task. I did a quick refactor in a > > branch, simply changing the ColumnFamilyStore class to TableStore, and > > pushed it up to GitHub. [1] > > > > Didn’t click on the link? That’s ok. The TL;DR is that it’s almost 2K > > LOC changed across 275 files. I’ll note that my branch doesn’t change > any > > of the almost 1000 search results of “columnfamilystore” found in the > > codebase and hundreds of tests failed on my branch in CircleCI, so that > 2K > > LOC change would probably be quite a bit bigger. There is, of course, a > > lot more than just renaming this one class, there’s thousands of variable > > names using any manner of “cf”, “cfs”, “columnfamily”, names plus > comments > > and who knows what else. There’s lots of references in probably every > file > > that would have to get updated. > > > > What are people’s thoughts on this? We should be honest with ourselves > > and know this isn’t going to get any easier over time. It’s only going > to > > get more confusing for new people to the project, and having to figure > out > > “what kind of row am i even looking at” is a waste of time. There’s > > obviously a much bigger impact than just renaming a bunch of files, > there’s > > any number of patches and branches that would become outdated, plus > anyone > > pulling in Cassandra as a dependency would be affected. I don’t really > > have a solution for the disruption other than “leave it in place”, but in > > my mind that’s not a great (or even good) solution. > > > > Anyways, enough out of me. My concern for ergonomics and naming might be > > significantly higher than the rest of the folks working in the code, and > I > > wanted to put a feeler out there before I decided to dig into this in a > > more serious manner. > > > > Jon > > > > [1] > > > https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1 > > < > > > https://github.com/apache/cassandra/compare/trunk...rustyrazorblade:refactor_column_family_store?expand=1 > > > >