[ 
https://issues.apache.org/jira/browse/CASSANDRA-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096080#comment-13096080
 ] 

Sylvain Lebresne commented on CASSANDRA-3122:
---------------------------------------------

bq. I don't understand how the changes to writeRow work without doing anything 
to the cf besides asking for its serializedSize

In (the new method) getColumnFamily, when we reuse a previous column family to 
add new columns to it, we start by removing its size from the estimate, so 
that when writeRow is called on the updated cf and adds the whole size back, we 
still have a good estimate (actually a better one than before, because we no 
longer count the row key multiple times).
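The bookkeeping described above can be sketched as follows. This is a toy illustration, not Cassandra's actual code: the class, method, and field names (SizeTrackingWriter, getColumnFamily, writeRow, currentSize) are placeholders standing in for the real SSTableSimpleUnsortedWriter internals, and a StringBuilder stands in for a column family. The point is the subtract-then-re-add pattern: when a buffered cf is handed back out for more columns, its current size is removed from the running estimate, so writeRow can add the whole updated size without double counting, and the row key is counted exactly once.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the size-estimate bookkeeping (illustrative names,
// not Cassandra's real API). A StringBuilder stands in for a column family.
class SizeTrackingWriter {
    private final Map<String, StringBuilder> buffer = new HashMap<>();
    long currentSize = 0; // running estimate of buffered data

    StringBuilder getColumnFamily(String key) {
        StringBuilder cf = buffer.get(key);
        if (cf == null) {
            cf = new StringBuilder();
            buffer.put(key, cf);
            currentSize += key.length(); // row key counted once, up front
        } else {
            // reuse: undo this cf's previous contribution so writeRow can
            // re-add the whole (updated) size without double counting
            currentSize -= cf.length();
        }
        return cf;
    }

    void writeRow(StringBuilder cf) {
        currentSize += cf.length(); // add the whole serialized size back
    }
}
```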

> SSTableSimpleUnsortedWriter take long time when inserting big rows
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-3122
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3122
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Benoit Perroud
>            Priority: Minor
>             Fix For: 0.8.5
>
>         Attachments: 3122.patch, SSTableSimpleUnsortedWriter-v2.patch, 
> SSTableSimpleUnsortedWriter.patch
>
>
> In SSTableSimpleUnsortedWriter, when dealing with rows that have a lot of 
> columns, if we call newRow several times (to flush data as soon as possible), 
> the time taken by the newRow() call increases non-linearly. This is because 
> each time newRow is called, we merge the ever-growing existing CF with the 
> new one.
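The non-linear cost described in the issue can be illustrated with a toy model (not Cassandra code; MergingBuffer, addColumn, and newRow are hypothetical names). If every newRow() call merges the existing CF into a fresh one, every column buffered so far is copied again on each call, so k calls against the same growing row perform roughly k²/2 column copies instead of k.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the old behavior (illustrative names, not Cassandra's API):
// newRow() builds a "new" cf and merges the existing one into it, re-copying
// every previously buffered column on every call.
class MergingBuffer {
    List<String> current = new ArrayList<>();
    long copies = 0; // total column copies performed by merges

    void addColumn(String c) {
        current.add(c);
    }

    void newRow() {
        // merge: copy all existing columns into the new cf
        List<String> merged = new ArrayList<>(current.size());
        for (String c : current) {
            merged.add(c);
            copies++;
        }
        current = merged;
    }
}
```

With 100 interleaved addColumn/newRow calls, the merges copy 1 + 2 + ... + 100 = 5050 columns, which is the quadratic growth the patch avoids.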

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
