[ 
https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361195#comment-14361195
 ] 

Sylvain Lebresne commented on CASSANDRA-8099:
---------------------------------------------

bq. I think it's a shame this patch wasn't attempted at least a little more 
incrementally.

I certainly understand the criticism; this is definitely not as incremental 
as it should be. My lame "defence" is that since this structurally changes the 
main abstraction used by the storage engine, it quickly trickles down to 
everything else, so that I just wasn't sure how to attack this more 
incrementally in practice. For the serialization formats, I could indeed have 
stuck to serializing to the old format, but given the mismatch between the old 
format and the new abstractions, it was actually simpler to just write in a 
meaningful format right away (it allowed me to get something working faster).  
And since the new serialization format details are fairly well encapsulated 
(mostly in {{AtomSerializer.java}}), I'll admit it didn't feel like a huge deal 
overall. But in any case, I probably haven't tried hard enough and/or I'm not 
smart enough to have figured out how to make that happen more incrementally and 
for that, I apologize.

bq. I'm also worried I'm finding myself saying "too close to release to 
question this decision"

I agree that not questioning a decision that you think is worth questioning 
should be avoided, but I also don't think that this needs to be the case. If 
you think a decision makes things worse than they are in current trunk, then by 
all means, let's bring it up. If enough such concerns are voiced to make us 
think this patch won't be a net improvement over the status quo and there is no 
time to address those concerns, then I'll be the first to suggest that, as sad 
as that would make me, we should consider pushing it after 3.0 (but I do have 
the weakness to think that the patch is a net improvement).

Now, I don't pretend that every choice made here is absolutely optimal (I'm 
afraid I'm not that smart) so there will be things that can be improved (and 
maybe some will require subsequent changes). But as long as something doesn't 
make things worse than they currently are, I'd suggest it's probably ok to just 
create tickets for those improvements. After all, this isn't meant at all to be 
the definitive version of the Cassandra code, it just aims to be cleaner ground 
to improve upon than what we currently have.

Don't get me wrong, I'm not trying to say that such a big patch is ideal, it's 
not. I just didn't figure out how to do better.

> Refactor and modernize the storage engine
> -----------------------------------------
>
>                 Key: CASSANDRA-8099
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0
>
>         Attachments: 8099-nit
>
>
> The current storage engine (which for this ticket I'll loosely define as "the 
> code implementing the read/write path") is suffering from old age. One of the 
> main problems is that the only structure it deals with is the cell, which 
> completely ignores the higher-level CQL structure that groups cells into 
> (CQL) rows.
> This leads to many inefficiencies, like the fact that during a read we have 
> to group cells multiple times (to count on the replica, then to count on the 
> coordinator, then to produce the CQL resultset) because we forget about the 
> grouping right away each time (so lots of useless cell name comparisons in 
> particular). But beyond inefficiencies, having to manually recreate the CQL 
> structure every time we need it for something is hindering new features and 
> makes the code more complex than it should be.
> Said storage engine also has tons of technical debt. To pick an example, the 
> fact that during range queries we update {{SliceQueryFilter.count}} is pretty 
> hacky and error prone. Or the overly complex ways {{AbstractQueryPager}} has 
> to go into to simply "remove the last query result".
> So I want to bite the bullet and modernize this storage engine. I propose to 
> do 2 main things:
> # Make the storage engine more aware of the CQL structure. In practice, 
> instead of having partitions be a simple iterable map of cells, it should be 
> an iterable list of rows (each being itself composed of per-column cells, 
> though obviously not exactly the same kind of cell we have today).
> # Make the engine more iterative. What I mean here is that in the read path, 
> we end up reading all cells in memory (we put them in a ColumnFamily object), 
> but there is really no reason to. If instead we were working with iterators 
> all the way through, we could get to a point where we're basically 
> transferring data from disk to the network, and we should be able to reduce 
> GC substantially.
> Please note that such a refactor should provide some performance improvements 
> right off the bat but it's not its primary goal either. Its primary goal is 
> to simplify the storage engine and add abstractions that are better suited to 
> further optimizations.
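
To make the two points in the description above a bit more concrete, here is a rough sketch of what "an iterable list of rows" consumed "with iterators all the way through" could look like. Everything in it ({{RowGroupingSketch}}, {{Cell}}, {{Row}}, {{groupIntoRows}}) is a hypothetical name invented for illustration; none of it is taken from the patch or from the Cassandra code base. It also assumes the cells arrive already sorted by clustering key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical types for illustration only; not the classes used by the patch.
final class RowGroupingSketch {

    /** A single cell: the clustering key of its row, a column name and a value. */
    static final class Cell {
        final String clustering;
        final String column;
        final String value;

        Cell(String clustering, String column, String value) {
            this.clustering = clustering;
            this.column = column;
            this.value = value;
        }
    }

    /** A CQL-like row: all the cells that share one clustering key. */
    static final class Row {
        final String clustering;
        final List<Cell> cells;

        Row(String clustering, List<Cell> cells) {
            this.clustering = clustering;
            this.cells = cells;
        }
    }

    /**
     * Lazily groups cells (assumed sorted by clustering key) into rows.
     * Only the cells of the row being built are held in memory, so the
     * grouping is done once and downstream consumers can simply iterate rows.
     */
    static Iterator<Row> groupIntoRows(final Iterator<Cell> cells) {
        return new Iterator<Row>() {
            // First cell of the next row, or null once the input is exhausted.
            private Cell pending = cells.hasNext() ? cells.next() : null;

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public Row next() {
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                String clustering = pending.clustering;
                List<Cell> rowCells = new ArrayList<>();
                // Consume cells until the clustering key changes.
                while (pending != null && pending.clustering.equals(clustering)) {
                    rowCells.add(pending);
                    pending = cells.hasNext() ? cells.next() : null;
                }
                return new Row(clustering, rowCells);
            }
        };
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("k1", "a", "1"),
            new Cell("k1", "b", "2"),
            new Cell("k2", "a", "3"));
        Iterator<Row> rows = groupIntoRows(cells.iterator());
        while (rows.hasNext()) {
            Row row = rows.next();
            System.out.println(row.clustering + " -> " + row.cells.size() + " cell(s)");
        }
    }
}
```

The point of the sketch is only that the grouping happens once, at the source, and that nothing forces the whole partition into memory: a consumer that counts rows, filters them or serializes them can pull one row at a time.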



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
