[ 
https://issues.apache.org/jira/browse/CASSANDRA-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097066#comment-13097066
 ] 

Sylvain Lebresne commented on CASSANDRA-3122:
---------------------------------------------

bq. every time newRow is called, serializedSize iterates through all the columns 
to compute the size

Yes, and I agree this isn't the most efficient approach, though I would be 
somewhat surprised if it turned out to be a bottleneck. Anyway, I don't oppose 
improving this, but we should create a new ticket for that.
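
To make the cost in question concrete, here is a minimal, self-contained sketch 
of the pattern, not the actual Cassandra code. RowBuffer, rescannedSize, 
trackedSize and columnsVisited are hypothetical names invented for this 
illustration. Recomputing the size by walking every column after each insert 
visits 1 + 2 + ... + n columns in total, while a running counter updated on 
each add does constant work per column.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

public class SizeTrackingSketch {

    /** Hypothetical stand-in for a row buffer; NOT Cassandra's ColumnFamily. */
    static final class RowBuffer {
        private final TreeMap<String, byte[]> columns = new TreeMap<>();
        private long trackedSize = 0; // running total, maintained on every add
        long columnsVisited = 0;      // work counter for the rescanning approach

        static long sizeOf(String name, byte[] value) {
            return name.getBytes(StandardCharsets.UTF_8).length + value.length;
        }

        void addColumn(String name, byte[] value) {
            byte[] old = columns.put(name, value);
            if (old != null) {
                trackedSize -= sizeOf(name, old); // overwritten column no longer counts
            }
            trackedSize += sizeOf(name, value);
        }

        /** O(number of columns): walks everything on each call. */
        long rescannedSize() {
            long total = 0;
            for (Map.Entry<String, byte[]> e : columns.entrySet()) {
                total += sizeOf(e.getKey(), e.getValue());
                columnsVisited++;
            }
            return total;
        }

        /** O(1): returns the counter kept up to date by addColumn. */
        long trackedSize() {
            return trackedSize;
        }
    }

    public static void main(String[] args) {
        RowBuffer row = new RowBuffer();
        int n = 1000;
        for (int i = 0; i < n; i++) {
            row.addColumn("col" + i, new byte[8]);
            // Both approaches must agree; only the work they do differs.
            if (row.rescannedSize() != row.trackedSize()) {
                throw new IllegalStateException("size mismatch at column " + i);
            }
        }
        // Rescanning after each of n inserts visits 1 + 2 + ... + n columns.
        System.out.println("columns visited by rescans: " + row.columnsVisited);
    }
}
```

Whether the rescan matters in practice depends on how often the size is 
queried relative to the row width, which is why measuring it in a separate 
ticket seems reasonable.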

bq. An improvement in bulk loading would be to use a "single threaded" 
ColumnFamily for bulk loading.

Yes, but we'll do it in 1.0 only, because there we have CASSANDRA-2843, which 
basically makes this trivial, whereas it is uglier to do without it.

> SSTableSimpleUnsortedWriter take long time when inserting big rows
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-3122
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3122
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Benoit Perroud
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 0.8.5
>
>         Attachments: 3122.patch, SSTableSimpleUnsortedWriter-v2.patch, 
> SSTableSimpleUnsortedWriter.patch
>
>
> In SSTableSimpleUnsortedWriter, when dealing with rows that have a lot of 
> columns, if we call newRow several times (to flush data as soon as possible), 
> the time taken by each newRow() call increases non-linearly. This is 
> because each time newRow is called, we merge the ever-growing existing CF 
> with the new one.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
