[
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077020#comment-14077020
]
Russell Alexander Spitzer commented on CASSANDRA-7631:
------------------------------------------------------
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java
wraps SSTableSimpleUnsortedWriter, so I think we are OK there. The main reason
I would like this as part of stress is that we already have all the data
generation code baked in for arbitrary schemas. Thanks [~tjake]! This way we
could prepare for a test that uses a large amount of data and a mixed workload
much faster.
> Allow Stress to write directly to SSTables
> ------------------------------------------
>
> Key: CASSANDRA-7631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Russell Alexander Spitzer
> Assignee: Russell Alexander Spitzer
>
> One common difficulty with benchmarking machines is the amount of time it
> takes to initially load data. For machines with a large amount of RAM this
> becomes especially onerous, because a very large amount of data needs to be
> placed on the machine before the page cache can be circumvented.
> To remedy this, I suggest we add a top-level flag to cassandra-stress which
> would cause the tool to write directly to sstables rather than actually
> performing CQL inserts. Internally this would use CQLSSTableWriter to write
> directly to sstables while skipping any keys which are not owned by the node
> stress is running on. The same stress command run on each node in the cluster
> would then write unique sstables containing only the data which that node is
> responsible for. Following this, no further network IO would be required to
> distribute data, as it would all already be correctly in place.
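
For illustration, the key-skipping step described above could look roughly
like the sketch below. It is a minimal, self-contained model and not code from
cassandra-stress: the class OwnershipFilter, its methods, the three-node ring
and the String.hashCode() token function are all hypothetical stand-ins (a real
cluster would use its configured partitioner and real ring metadata), and
replication is ignored, so each key has exactly one owner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from cassandra-stress.
class OwnershipFilter {

    // A node owns the range (previous token, its own token]. The owner of a
    // token is therefore the first ring entry at or after it, wrapping around
    // to the lowest token when there is none.
    static String ownerOf(TreeMap<Long, String> ring, long token) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Assumption: stand-in for the partitioner's hash function.
    static long tokenFor(String key) {
        return key.hashCode();
    }

    // Keeps only the keys whose token falls in a range owned by localNode;
    // every other key is skipped and would be written by some other node.
    static List<String> keysOwnedBy(TreeMap<Long, String> ring,
                                    String localNode,
                                    List<String> keys) {
        List<String> owned = new ArrayList<>();
        for (String key : keys) {
            if (ownerOf(ring, tokenFor(key)).equals(localNode)) {
                owned.add(key);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        // Made-up ring: token -> node name.
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(-1_000_000_000L, "node-a");
        ring.put(0L, "node-b");
        ring.put(1_000_000_000L, "node-c");

        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            keys.add("key" + i);
        }

        // Each node would run the same loop with its own name and hand the
        // surviving keys to its local sstable writer.
        for (String node : ring.values()) {
            System.out.println(node + " writes " + keysOwnedBy(ring, node, keys));
        }
    }
}
```

Because every node applies the same deterministic filter to the same generated
key sequence, the per-node outputs are disjoint and together cover every key,
which is what would make the later streaming step unnecessary.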