[
https://issues.apache.org/jira/browse/CASSANDRA-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Dejanovski updated CASSANDRA-11105:
---------------------------------------------
Attachment: 11105-trunk.txt
Here's a patch for trunk that fixes cassandra-stress's use of batches.
Currently all rows that are part of an operation get batched together
(capped at 65k statements per batch), regardless of the select distribution
defined in the yaml file.
This also contradicts the output printed at the start of a stress run, which
reports the batch size distribution: even when it states that it will
generate single-row batches, the tool still batches all rows together.
The patch passes the batch size from StressProfile to SchemaInsert and
computes a new batch size within the bounds for each operation (since batch
size is a distribution, not a fixed value).
It also uses asynchronous queries instead of synchronous ones, to match
performance best practices.
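The idea behind the fix (the actual patch is Java, in SchemaInsert) can be sketched roughly as follows; the function names `sample_batch_size` and `chunk_rows` are hypothetical and only illustrate re-drawing the batch size per operation from the configured distribution, instead of putting every row into one batch:

```python
import random

def sample_batch_size(min_rows, max_rows):
    # Batch size is a distribution, not a fixed value: draw a fresh
    # size for every batch, bounded by the configured limits.
    return random.randint(min_rows, max_rows)

def chunk_rows(rows, min_rows, max_rows):
    # Split an operation's rows into batches whose sizes are sampled
    # per batch, rather than batching all rows of the operation together.
    batches, i = [], 0
    while i < len(rows):
        size = sample_batch_size(min_rows, max_rows)
        batches.append(rows[i:i + size])
        i += size
    return batches

batches = chunk_rows(list(range(1000)), 1, 10)
assert sum(len(b) for b in batches) == 1000
assert all(1 <= len(b) <= 10 for b in batches)
```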
> cassandra-stress tool - InvalidQueryException: Batch too large
> --------------------------------------------------------------
>
> Key: CASSANDRA-11105
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11105
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Environment: Cassandra 2.2.4, Java 8, CentOS 6.5
> Reporter: Ralf Steppacher
> Attachments: 11105-trunk.txt, batch_too_large.yaml
>
>
> I am using Cassandra 2.2.4 and I am struggling to get the cassandra-stress
> tool to work for my test scenario. I have followed the example on
> http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema
> to create a yaml file describing my test (attached).
> I am collecting events per user id (text, partition key). Events have a
> session type (text), event type (text), and creation time (timestamp)
> (clustering keys, in that order). Plus some more attributes required for
> rendering the events in a UI. For testing purposes I ended up with the
> following column spec and insert distribution:
> {noformat}
> columnspec:
>   - name: created_at
>     cluster: uniform(10..10000)
>   - name: event_type
>     size: uniform(5..10)
>     population: uniform(1..30)
>     cluster: uniform(1..30)
>   - name: session_type
>     size: fixed(5)
>     population: uniform(1..4)
>     cluster: uniform(1..4)
>   - name: user_id
>     size: fixed(15)
>     population: uniform(1..1000000)
>   - name: message
>     size: uniform(10..100)
>     population: uniform(1..100B)
> insert:
>   partitions: fixed(1)
>   batchtype: UNLOGGED
>   select: fixed(1)/1200000
> {noformat}
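With this columnspec, the total rows per partition are bounded by the product of the three clustering distributions, which is where the `[10..1200000]` range reported by the tool below comes from; a quick check (assuming the per-column bounds simply multiply):

```python
# Clustering column cardinality bounds from the columnspec above.
created_at = (10, 10000)
event_type = (1, 30)
session_type = (1, 4)

min_rows = created_at[0] * event_type[0] * session_type[0]
max_rows = created_at[1] * event_type[1] * session_type[1]
print(min_rows, max_rows)  # 10 1200000
```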
> Running stress tool for just the insert prints
> {noformat}
> Generating batches with [1..1] partitions and [0..1] rows (of [10..1200000] total rows in the partitions)
> {noformat}
> and then immediately starts flooding me with
> {{com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large}}.
> I do not understand why I should be exceeding the
> {{batch_size_fail_threshold_in_kb: 50}} setting in {{cassandra.yaml}}. My
> understanding is that the stress tool should generate one row per batch.
> The size of a single row should not exceed
> {{8+10*3+5*3+15*3+100*3 = 398 bytes}}, assuming a worst case of all text
> characters being 3-byte unicode characters.
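The reporter's worst-case estimate for a single row checks out:

```python
# Worst-case row size: timestamp (8 bytes) plus each text column at its
# maximum length, at 3 bytes per character.
row_size = 8 + 10*3 + 5*3 + 15*3 + 100*3  # created_at, event_type,
                                          # session_type, user_id, message
print(row_size)  # 398
```

So a single row is nowhere near the 50 KB batch threshold, supporting the diagnosis that many rows are being batched together.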
> This is how I start the attached user scenario:
> {noformat}
> [rsteppac@centos bin]$ ./cassandra-stress user profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node 10.211.55.8
> INFO  08:00:07 Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
> INFO  08:00:08 Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
> INFO  08:00:08 New Cassandra host /10.211.55.8:9042 added
> Connected to cluster: Titan_DEV
> Datatacenter: datacenter1; Host: /10.211.55.8; Rack: rack1
> Created schema. Sleeping 1s for propagation.
> Generating batches with [1..1] partitions and [0..1] rows (of [10..1200000] total rows in the partitions)
> com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large
> 	at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
> 	at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:271)
> 	at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:185)
> 	at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:55)
> 	at org.apache.cassandra.stress.operations.userdefined.SchemaInsert$JavaDriverRun.run(SchemaInsert.java:87)
> 	at org.apache.cassandra.stress.Operation.timeWithRetry(Operation.java:159)
> 	at org.apache.cassandra.stress.operations.userdefined.SchemaInsert.run(SchemaInsert.java:119)
> 	at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:309)
> Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large
> 	at com.datastax.driver.core.Responses$Error.asException(Responses.java:125)
> 	at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:120)
> 	at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:186)
> 	at com.datastax.driver.core.RequestHandler.access$2300(RequestHandler.java:45)
> 	at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:752)
> 	at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:576)
> 	at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1003)
> 	at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:925)
> 	at com.datastax.shaded.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at com.datastax.shaded.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at com.datastax.shaded.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at com.datastax.shaded.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at com.datastax.shaded.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
> 	at com.datastax.shaded.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> 	at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> 	at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> 	at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> 	at com.datastax.shaded.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> 	at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> 	at java.lang.Thread.run(Thread.java:745)
> ...
> {noformat}
> The C* log:
> {noformat}
> INFO 08:00:04 Listening for thrift clients...
> WARN  08:00:07 Detected connection using native protocol version 2. Both version 1 and 2 of the native protocol are now deprecated and support will be removed in Cassandra 3.0. You are encouraged to upgrade to a client driver using version 3 of the native protocol
> ERROR 08:00:14 Batch of prepared statements for [stresscql.batch_too_large] is of size 58024, exceeding specified threshold of 51200 by 6824. (see batch_size_fail_threshold_in_kb)
> ERROR 08:00:15 Batch of prepared statements for [stresscql.batch_too_large] is of size 77985, exceeding specified threshold of 51200 by 26785. (see batch_size_fail_threshold_in_kb)
> ...
> {noformat}
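Note that the sizes in these error messages are in bytes, even though the setting is named {{batch_size_fail_threshold_in_kb}}: the threshold of 51200 is exactly 50 KB, and the reported overages are consistent with that:

```python
# batch_size_fail_threshold_in_kb: 50, converted to bytes.
threshold = 50 * 1024
print(threshold)            # 51200
print(58024 - threshold)    # 6824, as in the first ERROR line
print(77985 - threshold)    # 26785, as in the second ERROR line
```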
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)