[ 
https://issues.apache.org/jira/browse/CASSANDRA-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737408#action_12737408
 ] 

Jonathan Ellis commented on CASSANDRA-16:
-----------------------------------------

We have a bigger problem.

We rely on knowing the total size of the serialized columns to be able to seek 
around the sstable.  But we can't write that data at the start without making 
two passes (the first to compute the size).  Obviously writing it at the end is 
a nonstarter since we'd have no way to know where the end is, absent the size 
information.
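
To make the constraint concrete, here is a minimal Python sketch of a size-prefixed row layout. The layout itself, the names write_row and find_row, and the 4-byte signed size field are assumptions for illustration, not the actual sstable format. It shows why the writer needs the whole serialized row (or a first pass over it) before it can emit the size, and how a reader uses that size to seek past rows it does not want.

```python
import io
import struct

# Hypothetical row layout (an assumption, not Cassandra's real format):
#   [2-byte key length][key][4-byte signed size of columns][column bytes]
KEY_LEN = struct.Struct(">H")
SIZE = struct.Struct(">i")  # signed 32-bit: max 2**31 - 1 bytes


def write_row(out, key, columns):
    """Write one row. The size precedes the data, so the full
    payload has to be known before anything after the key is written."""
    payload = b"".join(columns)  # whole row buffered in memory here
    out.write(KEY_LEN.pack(len(key)))
    out.write(key)
    out.write(SIZE.pack(len(payload)))
    out.write(payload)


def find_row(inp, wanted):
    """Scan rows from the current position, seeking past unwanted
    column data via the size prefix. Returns the payload or None."""
    while True:
        header = inp.read(KEY_LEN.size)
        if len(header) < KEY_LEN.size:
            return None  # end of file, key not present
        (key_len,) = KEY_LEN.unpack(header)
        key = inp.read(key_len)
        (size,) = SIZE.unpack(inp.read(SIZE.size))
        if key == wanted:
            return inp.read(size)
        inp.seek(size, io.SEEK_CUR)  # skip columns without reading them


buf = io.BytesIO()
write_row(buf, b"alice", [b"col1", b"col2"])
write_row(buf, b"bob", [b"col3"])
buf.seek(0)
found = find_row(buf, b"bob")
```

Here the join in write_row stands in for either holding the row in memory or making a sizing pass first. Moving the size field after the payload would remove that need, but find_row could then no longer locate the end of a row it wants to skip.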

Bigtable doesn't seem to have found a way out of this either, limiting the data 
associated with a key to 64KB (see section 4).

I'd rather limit the size (2GB is the current limit, which I think is more 
reasonable than 64KB) than make two passes in compaction.  Huge rows seem 
almost like a misfeature given the key-oriented partitioner design.

> Memory efficient compactions 
> -----------------------------
>
>                 Key: CASSANDRA-16
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: All
>            Reporter: Sandeep Tata
>            Priority: Critical
>             Fix For: 0.5
>
>
> The basic idea is to allow rows to get large enough that they don't have to 
> fit in memory entirely, but can easily fit on a disk. The compaction 
> algorithm today de-serializes the entire row in memory before writing out the 
> compacted SSTable (see ColumnFamilyStore.doCompaction() and associated 
> methods).
> The requirement is to have a compaction method with a lower memory 
> requirement so we can support rows larger than available main memory. To 
> re-use the old FB example, if we stored a user's inbox in a row, we'd want 
> the inbox to grow bigger than memory so long as it fit on disk.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
