[
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077050#comment-14077050
]
Brandon Williams commented on CASSANDRA-7631:
---------------------------------------------
bq. Stress seems like a perfectly reasonable place to put this, really. It also
means we know the data generated is compatible with the stress workload, which
is important.
I agree with your latter point, but we could still reuse the code in a separate
utility. It just seems like stress has enough options as it is, and
introducing an sstable writer would make a lot of them nonsensical (like
consistency level, replication, etc.). I'd somewhat prefer having a clear
delineation, util-wise, between going over the network and writing to disk.
> Allow Stress to write directly to SSTables
> ------------------------------------------
>
> Key: CASSANDRA-7631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Russell Alexander Spitzer
> Assignee: Russell Alexander Spitzer
>
> One common difficulty with benchmarking machines is the amount of time it
> takes to initially load data. For machines with a large amount of RAM this
> becomes especially onerous because a very large amount of data needs to be
> placed on the machine before page-cache can be circumvented.
> To remedy this, I suggest we add a top-level flag to cassandra-stress which
> would cause the tool to write directly to sstables rather than actually
> performing CQL inserts. Internally this would use CQLSSTableWriter to write
> directly to sstables while skipping any keys which are not owned by the node
> stress is running on. The same stress command run on each node in the cluster
> would then write unique sstables only containing data which that node is
> responsible for. After that, no further network I/O would be required to
> distribute the data, as it would all already be in place.
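
The key-skipping step in the description above can be illustrated with a
minimal, self-contained sketch. Everything in it is an assumption made for
illustration: the class and method names are hypothetical and belong to neither
Cassandra nor cassandra-stress, CRC32 stands in for the cluster's real
partitioner, and the ring is simplified to one token per node with no
replication. It shows only the ownership filter, not the sstable writing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch: none of these names come from Cassandra or cassandra-stress.
final class OwnedKeyFilter {
    // Ring position -> node name; assumes one token per node and no replication.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    OwnedKeyFilter(Map<String, Long> tokenByNode) {
        if (tokenByNode.isEmpty()) {
            throw new IllegalArgumentException("ring must contain at least one node");
        }
        for (Map.Entry<String, Long> e : tokenByNode.entrySet()) {
            ring.put(e.getValue(), e.getKey());
        }
    }

    // Stand-in token function (CRC32); a real cluster would use its partitioner.
    static long tokenFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // A node owns the range (previous token, own token]; tokens past the
    // largest ring position wrap around to the node with the smallest one.
    String ownerOf(long token) {
        Map.Entry<Long, String> e = ring.ceilingEntry(token);
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Keep only the keys the local node would write to its own sstables.
    List<String> keysOwnedBy(String localNode, List<String> keys) {
        List<String> owned = new ArrayList<>();
        for (String key : keys) {
            if (ownerOf(tokenFor(key)).equals(localNode)) {
                owned.add(key);
            }
        }
        return owned;
    }
}
```

Under these assumptions, running the same key sequence through keysOwnedBy on
every node yields disjoint subsets that together cover all keys, which is the
property the proposal relies on to avoid any later redistribution.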