[
https://issues.apache.org/jira/browse/CASSANDRA-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15359128#comment-15359128
]
Paulo Motta edited comment on CASSANDRA-11670 at 7/1/16 3:31 PM:
-----------------------------------------------------------------
Instead of having separate paths for small and large batches, I think it's
simpler and cleaner to redesign our batchlog table to expand the current
{{list<mutation>}} column into clustered rows, so we can append mutations
individually to the same batchlog partition without being restricted to
{{max_mutation_size_in_kb}} for the total batchlog size.
The idea is to have something like
{noformat}
CREATE TABLE Batches (
id timeuuid,
idx bigint,
mutation blob,
version int static,
active boolean static,
PRIMARY KEY ((id), idx)
)
{noformat}
So, creating a batch is a matter of populating a partition with mutations and
then setting the {{active}} flag to true, which will indicate the batch is
ready to be potentially replayed (building on [~carlyeks]'s suggestion).
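As an illustrative sketch of the write sequence (the real implementation would apply these as internal mutations rather than CQL statements; the values are placeholders):
{noformat}
-- write the static columns once per batch (serialization format version)
INSERT INTO Batches (id, version, active) VALUES (?, ?, false);
-- append each serialized mutation as its own clustered row
INSERT INTO Batches (id, idx, mutation) VALUES (?, 0, ?);
INSERT INTO Batches (id, idx, mutation) VALUES (?, 1, ?);
-- finally flip the static flag: the batch is now eligible for replay
UPDATE Batches SET active = true WHERE id = ?;
{noformat}
Since each row is written as a separate mutation, only individual mutations need to fit under {{max_mutation_size_in_kb}}, not the batch as a whole.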
In order to verify the potential performance impact of this change, I ran 3
cstar tests against 3 different implementations; throughput and latency do not
seem to be affected.
The 3 compared branches are:
* [3.0-noreplay|https://github.com/pauloricardomg/cassandra/tree/3.0-noreplay]
(3.0 with disabled batchlog replay - so it's comparable with others)
* [11670-v2|https://github.com/pauloricardomg/cassandra/tree/3.0-11670-v2]
(table above)
* [11670-v3|https://github.com/pauloricardomg/cassandra/tree/3.0-11670-v3]
(alternative design where table is clustered, but mutations are stored as a
{{list<blob>}} in order to have fewer rows - schema below)
** {noformat}
CREATE TABLE Batches (
id timeuuid,
idx bigint,
mutations list<blob>,
version int static,
active boolean static,
PRIMARY KEY ((id), idx)
)
{noformat}
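Under the v3 design above, appending to a batch would grow the list in an existing chunk row instead of adding a new row; roughly (placeholder values, illustrative only):
{noformat}
-- append a serialized mutation to the list held by chunk row (id, idx)
UPDATE Batches SET mutations = mutations + [?] WHERE id = ? AND idx = ?;
{noformat}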
The cstar tests are:
* [1 materialized
view|http://cstar.datastax.com/graph?command=one_job&stats=01de5e5e-3ed3-11e6-8a53-0256e416528f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=2690.27&ymin=0&ymax=70239.4]
* [3 materialized
views|http://cstar.datastax.com/graph?command=one_job&stats=0a632460-3edd-11e6-85ce-0256e416528f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=5128.86&ymin=0&ymax=2832.5]
* [cqlstress-example.yaml (multi-partition
batches)|http://cstar.datastax.com/graph?command=one_job&stats=0a632460-3edd-11e6-85ce-0256e416528f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=5128.86&ymin=0&ymax=2832.5]
As said before, from these results there doesn't seem to be any impact on
throughput/latency from switching to this approach. If we decide to go with
this, little change will be necessary in the batchlog handling code to support
it; most of the effort will probably be on supporting upgrade to this new
scheme. I'm not sure if there are any other potential issues with turning the
batchlog into a wide table and applying mutations individually, but if not I
think we should go with this approach. WDYT [~carlyeks] [~iamaleksey]?
> Error while waiting on bootstrap to complete. Bootstrap will have to be
> restarted. Stream failed
> ------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11670
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11670
> Project: Cassandra
> Issue Type: Bug
> Components: Configuration, Streaming and Messaging
> Reporter: Anastasia Osintseva
> Assignee: Paulo Motta
> Fix For: 3.0.x
>
>
> My cluster has 2 DCs, with 2 nodes in each DC. I wanted to add 1 node to
> each DC. One node was added successfully after I had run a scrub.
> Now I'm trying to add a node to the other DC, but get the error:
> org.apache.cassandra.streaming.StreamException: Stream failed.
> After scrubbing and repair I get the same error.
> {noformat}
> ERROR [StreamReceiveTask:5] 2016-04-27 00:33:21,082 Keyspace.java:492 -
> Unknown exception caught while attempting to update MaterializedView!
> messages_dump.messages
> java.lang.IllegalArgumentException: Mutation of 34974901 bytes is too large
> for the maxiumum size of 33554432
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:264)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:469)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:217)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:146)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:724)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.db.view.ViewManager.pushViewReplicaUpdates(ViewManager.java:149)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:487)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:217)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyUnsafe(Mutation.java:236)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:169)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_11]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
> ERROR [StreamReceiveTask:5] 2016-04-27 00:33:21,082
> StreamReceiveTask.java:214 - Error applying streamed data:
> java.lang.IllegalArgumentException: Mutation of 34974901 bytes is too large
> for the maxiumum size of 33554432
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:264)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:469)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:217)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:146)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:724)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.db.view.ViewManager.pushViewReplicaUpdates(ViewManager.java:149)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:487)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:217)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyUnsafe(Mutation.java:236)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:169)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_11]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
> ERROR [StreamReceiveTask:5] 2016-04-27 00:33:21,082 StreamSession.java:520 -
> [Stream #f849ffe0-0bee-11e6-9b5f-d16a1b9764ab] Streaming error occurred
> java.lang.IllegalArgumentException: Mutation of 34974901 bytes is too large
> for the maxiumum size of 33554432
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:264)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:469)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:217)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:146)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:724)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.db.view.ViewManager.pushViewReplicaUpdates(ViewManager.java:149)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:487)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:384)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:205)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:217)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at org.apache.cassandra.db.Mutation.applyUnsafe(Mutation.java:236)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:169)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_11]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
> DEBUG [StreamReceiveTask:5] 2016-04-27 00:33:21,082
> ConnectionHandler.java:110 - [Stream #f849ffe0-0bee-11e6-9b5f-d16a1b9764ab]
> Closing stream connection handler on /88.9.99.92
> DEBUG [STREAM-OUT-/88.9.99.92] 2016-04-27 00:33:21,082
> ConnectionHandler.java:341 - [Stream #f849ffe0-0bee-11e6-9b5f-d16a1b9764ab]
> Sending Session Failed
> INFO [StreamReceiveTask:5] 2016-04-27 00:33:21,082
> StreamResultFuture.java:182 - [Stream #f849ffe0-0bee-11e6-9b5f-d16a1b9764ab]
> Session with /88.9.99.92 is complete
> WARN [StreamReceiveTask:5] 2016-04-27 00:33:21,182
> StreamResultFuture.java:209 - [Stream #f849ffe0-0bee-11e6-9b5f-d16a1b9764ab]
> Stream failed
> ERROR [main] 2016-04-27 00:33:21,259 StorageService.java:1300 - Error while
> waiting on bootstrap to complete. Bootstrap will have to be restarted.
> java.util.concurrent.ExecutionException:
> org.apache.cassandra.streaming.StreamException: Stream failed
> at
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
> ~[guava-18.0.jar:na]
> at
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1295)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:971)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:745)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:610)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:333)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551)
> [apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679)
> [apache-cassandra-3.0.5.jar:3.0.5]
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
> at
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> ~[guava-18.0.jar:na]
> at
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:525)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:216)
> ~[apache-cassandra-3.0.5.jar:3.0.5]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ~[na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> ~[na:1.8.0_11]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> ~[na:1.8.0_11]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_11]
> {noformat}
> I set commitlog_segment_size_in_mb: 128, but it didn't help.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)