[ 
https://issues.apache.org/jira/browse/CASSANDRA-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593527#comment-13593527
 ] 

Sylvain Lebresne commented on CASSANDRA-3929:
---------------------------------------------

bq. I think if we make users think about columns we have lost. It should really 
be defined in terms of cql3 rows per partition.

I agree with that and I didn't mean to imply otherwise. I was just talking of 
columns and rows as in "the internal storage" to explain my concern. And 
technically, talking of the number of internal columns inside internal rows or 
of cql3 rows inside a partition doesn't really change the problem or how to 
solve it (after all, for some layouts, cql3 rows correspond one to one to 
internal columns).

bq. Put another way, we shouldn't leak implementation details

I fully agree, that was my point :)

bq. We should have logic on the query path to "pretend that compaction is 
perfect."

I agree we should. What logic exactly is another matter. Again, if we ignore 
deletes, then I suppose we could fix the "slice" problem I've described above 
by making a read on a CF with a max_cql_rows setting always read the row from 
the beginning (up to the data actually queried). That way, we could say "that 
column is here but we should pretend it's not". But doing that would be pretty 
painful in practice, and would have a non-negligible performance cost.
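
As a minimal sketch of that "read from the beginning" idea, ignoring deletes: 
the Python below models a partition as a sorted list of (key, value) pairs 
that compaction has not trimmed yet. The function name read_slice and the data 
layout are hypothetical and are not Cassandra's actual read path.

```python
def read_slice(partition, start, end, max_cql_rows):
    """Return rows with start <= key <= end, hiding rows past the limit.

    `partition` is a hypothetical sorted list of (key, value) pairs that may
    still hold more than `max_cql_rows` rows because compaction has not run.
    """
    result = []
    # Scan from the beginning of the partition, even when the slice starts
    # later: the position of a row is only known by counting from the start.
    for position, (key, value) in enumerate(partition):
        if position >= max_cql_rows or key > end:
            break  # past the limit, or past the queried range
        if key >= start:
            result.append((key, value))
    return result


partition = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
# Row 4 is physically present but beyond the limit of 3, so it stays hidden.
rows = read_slice(partition, start=3, end=4, max_cql_rows=3)
```

The scan over rows 1 and 2, which the query never asked for, is the extra read 
cost mentioned above.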

But if you throw in deletes, I'm honestly not sure it's possible to implement 
the "pretend that compaction is perfect" logic at all. The problem is, say your 
max_cql_rows is N, and you insert N+1 cql3 rows. And then you delete one of the 
first N rows. Now you are dependent on whether the N+1th row has already been 
removed by compaction or not. The only 2 ways to deal with that that I can see 
are either:
# not deleting any column internally (but doing the pretending client side) in 
case some deletes come in. But that defeats the whole purpose of the ticket.
# applying the truncation at write time. I.e., you say that as soon as you 
insert the N+1th column in a row, whatever lies beyond the Nth, i.e. the tail 
of that row, disappears right away. But that means a full row read on every 
write, which is not an option either.
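
The ordering dependency described above can be sketched in a few lines of 
Python. This is a toy model under stated assumptions, not Cassandra code: a 
partition is a plain list of row keys, and compact, delete and read are 
hypothetical helpers standing in for compaction, a delete and a read that 
applies the limit.

```python
def compact(rows, max_cql_rows):
    """Physically drop every row beyond the first max_cql_rows."""
    return rows[:max_cql_rows]


def delete(rows, key):
    """Remove one row (the tombstone is modeled as immediate removal)."""
    return [row for row in rows if row != key]


def read(rows, max_cql_rows):
    """Return what a client sees: at most the first max_cql_rows rows."""
    return rows[:max_cql_rows]


N = 2
inserted = [1, 2, 3]  # N+1 rows written to the same partition

# Compaction runs before the delete: row 3 is already gone for good.
compacted_first = read(delete(compact(inserted, N), 1), N)

# The delete arrives before compaction: row 3 moves back inside the limit.
deleted_first = read(compact(delete(inserted, 1), N), N)
```

In this model compacted_first is [2] while deleted_first is [2, 3], so the same 
sequence of client operations gives a different result depending on when 
compaction happened.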
                
> Support row size limits
> -----------------------
>
>                 Key: CASSANDRA-3929
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3929
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: ponies
>             Fix For: 2.0
>
>         Attachments: 3929_b.txt, 3929_c.txt, 3929_d.txt, 3929_e.txt, 
> 3929_f.txt, 3929_g_tests.txt, 3929_g.txt, 3929.txt
>
>
> We currently support expiring columns by time-to-live; we've also had 
> requests for keeping the most recent N columns in a row.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
