[jira] [Commented] (CASSANDRA-6737) A batch statements on a single partition should not create a new CF object for each update
[ https://issues.apache.org/jira/browse/CASSANDRA-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080952#comment-15080952 ]

Sylvain Lebresne commented on CASSANDRA-6737:
---------------------------------------------

bq. Would you say there's any limitation/recommendation regarding the number of statements contained in a single partition batch (or the summarized size in kb)?

A single partition batch is internally a single mutation, so unless I've missed some recent changes to the commit log, you're hard-limited by the size of a commit log segment, and I believe by default we actually limit that to half of the segment, so 16MB (see {{max_mutation_size_in_kb}} in the yaml).

Now, I'd really appreciate it if you could use the mailing list for such questions, as it is a more appropriate venue (especially since the question is barely related to the original ticket).

> A batch statements on a single partition should not create a new CF object for each update
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6737
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6737
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>              Labels: performance
>             Fix For: 2.0.6
>
>         Attachments: 6737.2.patch, 6737.txt
>
>
> BatchStatement creates a new ColumnFamily object (as well as a new RowMutation object) for every update in the batch, even if all those updates are actually on the same partition. This is particularly inefficient when bulkloading data into a single partition (which is not all that uncommon).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
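Editor's illustration of the size limit mentioned in the comment above. This is plain arithmetic in Python, not Cassandra code. It assumes a 32MB commit log segment, which is what the "half of the segment, so 16MB" figure implies; the function names are hypothetical and exist only for this sketch.

```python
# Illustrative arithmetic only; none of these names exist in Cassandra.
# Assumption: the commit log segment is 32MB, as implied by "half of the
# segment, so 16MB" in the comment above.
ASSUMED_SEGMENT_SIZE_MB = 32


def default_max_mutation_size_kb(segment_size_mb: int) -> int:
    """Return half of the commit log segment size, expressed in KB."""
    return segment_size_mb * 1024 // 2


def fits_in_one_mutation(batch_size_bytes: int,
                         segment_size_mb: int = ASSUMED_SEGMENT_SIZE_MB) -> bool:
    """Check whether a single partition batch of the given serialized size
    stays within the assumed default mutation size limit."""
    limit_bytes = default_max_mutation_size_kb(segment_size_mb) * 1024
    return batch_size_bytes <= limit_bytes
```

Under these assumptions, a single partition batch that serializes to more than 16MB would be rejected regardless of how many statements it contains, so the practical limit is on total size rather than on statement count.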
[jira] [Commented] (CASSANDRA-6737) A batch statements on a single partition should not create a new CF object for each update
[ https://issues.apache.org/jira/browse/CASSANDRA-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073752#comment-15073752 ]

Martin Grotzke commented on CASSANDRA-6737:
-------------------------------------------

[~slebresne] Would you say there's any limitation/recommendation regarding the number of statements contained in a single partition batch (or the summarized size in kb)?

Does a RowMutation for single partition batch statements become more "expensive" if it contains more statements, e.g. does the heap pressure grow with the number of contained statements?

(this question is related to my [comment in CASSANDRA-6487|https://issues.apache.org/jira/browse/CASSANDRA-6487?focusedCommentId=15059781&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059781])
[jira] [Commented] (CASSANDRA-6737) A batch statements on a single partition should not create a new CF object for each update
[ https://issues.apache.org/jira/browse/CASSANDRA-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906706#comment-13906706 ]

Sylvain Lebresne commented on CASSANDRA-6737:
---------------------------------------------

bq. FTR I'm at the point where other things being equal I'd prefer to put optimizations in 2.1.

FTR, I agree in general. But in this case, we have a rather simple-to-fix bottleneck that makes a not really crazy use case 20 times slower than what you would get with Thrift. At this point, it's not really an optimization imo, it's a bug that needs to be fixed.

But I do not mean this ticket to be "let's do every optimization for BatchStatement on a single partition that comes to mind"; that would definitely belong in 2.1.
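Editor's illustration of the bottleneck this ticket describes: one new ColumnFamily and RowMutation object per update, even when every update targets the same partition. The sketch below is a toy model in Python, not the actual patch. Plain tuples and dicts stand in for the Cassandra objects, and both function names are hypothetical.

```python
from collections import defaultdict

# Toy model only: an "update" is a (partition_key, column, value) tuple and a
# "mutation" is a (partition_key, {column: value}) pair. These stand in for
# Cassandra's RowMutation/ColumnFamily objects and are not its real API.


def build_mutations_per_update(updates):
    """Allocate one mutation per update, as in the behaviour the ticket reports."""
    return [(key, {column: value}) for key, column, value in updates]


def build_mutations_per_partition(updates):
    """Merge all updates that share a partition key into a single mutation."""
    grouped = defaultdict(dict)
    for key, column, value in updates:
        grouped[key][column] = value
    return list(grouped.items())
```

For a batch of N updates on one partition, the first function allocates N container objects and the second allocates one, which is the kind of saving the ticket is after.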
[jira] [Commented] (CASSANDRA-6737) A batch statements on a single partition should not create a new CF object for each update
[ https://issues.apache.org/jira/browse/CASSANDRA-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906007#comment-13906007 ]

Aleksey Yeschenko commented on CASSANDRA-6737:
----------------------------------------------

(I'd prefer this one to go into 2.0)
[jira] [Commented] (CASSANDRA-6737) A batch statements on a single partition should not create a new CF object for each update
[ https://issues.apache.org/jira/browse/CASSANDRA-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905931#comment-13905931 ]

Jonathan Ellis commented on CASSANDRA-6737:
-------------------------------------------

FTR I'm at the point where other things being equal I'd prefer to put optimizations in 2.1.