[ 
https://issues.apache.org/jira/browse/CASSANDRA-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350914#comment-14350914
 ] 

Robert Stupp commented on CASSANDRA-8929:
-----------------------------------------

That's what I meant: a tool that operates on the recorded statements. Recording 
the CQL statements (along with some state like pstmts) doesn't seem 
super-complicated or super-intrusive in the code path. It clearly adds some 
CPU and I/O overhead if turned on, and also some contention (multiple 
connections writing to a single trace). In the worst case it could slow down the 
node if we don't handle that situation (e.g. by dropping some trace information if 
the trace disk is too slow).
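
A minimal sketch of the "drop trace information if the disk is too slow" idea, 
in Python for brevity. Everything here is an assumption for illustration and 
not existing Cassandra code: the TraceRecorder class, its record() and drain() 
methods, and the sink callable are hypothetical names. Request threads put 
records into a bounded queue without blocking; when the writer falls behind and 
the queue is full, the record is dropped and counted instead of stalling the 
request path.

```python
import queue
import threading


class TraceRecorder:
    """Hypothetical recorder: many connections enqueue, one writer drains."""

    def __init__(self, sink, capacity=1024):
        # Bounded queue: its capacity limits how far the writer may lag behind.
        self._queue = queue.Queue(maxsize=capacity)
        self._sink = sink  # any callable that accepts one record
        self._lock = threading.Lock()
        self.dropped = 0

    def record(self, connection_id, statement):
        """Called on the request path; never blocks. Returns False on a drop."""
        try:
            self._queue.put_nowait((connection_id, statement))
            return True
        except queue.Full:
            # The writer is too slow: drop the record and count the loss.
            with self._lock:
                self.dropped += 1
            return False

    def drain(self):
        """Write everything currently queued; returns the number written."""
        written = 0
        while True:
            try:
                item = self._queue.get_nowait()
            except queue.Empty:
                return written
            self._sink(item)
            written += 1
```

In a real node drain() would run on a dedicated writer thread, and the dropped 
counter would probably be written to the trace as well, so that a playback tool 
knows the recording is incomplete.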

Technically that recording would be a trace of everything "on the wire", 
enriched by some additional information like a dump of all prepared statements 
at the beginning of the trace.

We could get a lot of information from such a trace: not just every native 
protocol operation but also network-related information like the number of 
established or closed connections, or whether a connection uses SSL, is 
authenticated and so on.
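
One possible shape for such a trace, sketched in Python. All names (Trace, 
TraceEvent, ConnectionInfo, the event kinds) are hypothetical and chosen only 
for illustration. The sketch shows why the prepared-statement dump at the start 
matters: an EXECUTE message on the wire carries only the statement id, so the 
CQL text has to be looked up in that dump.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ConnectionInfo:
    connection_id: int
    uses_ssl: bool
    authenticated: bool


@dataclass
class TraceEvent:
    timestamp_ms: int
    connection_id: int
    kind: str          # assumed kinds: "CONNECT", "QUERY", "EXECUTE", "CLOSE"
    payload: str = ""  # CQL text for QUERY, statement id for EXECUTE


@dataclass
class Trace:
    # Dump of all prepared statements taken when recording starts: id -> CQL.
    prepared_statements: Dict[str, str]
    connections: Dict[int, ConnectionInfo] = field(default_factory=dict)
    events: List[TraceEvent] = field(default_factory=list)

    def cql_for(self, event: TraceEvent) -> Optional[str]:
        """Return the CQL text behind an event, or None if it carries none."""
        if event.kind == "QUERY":
            return event.payload
        if event.kind == "EXECUTE":
            return self.prepared_statements.get(event.payload)
        return None

    def count(self, kind: str) -> int:
        """Number of events of one kind, e.g. established connections."""
        return sum(1 for e in self.events if e.kind == kind)
```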

Regarding the goal: I don't see upgrade acceptance tests as the only goal. 
A trace would also make it possible to analyze the operations that happen 
during some time frame, as part of "bug fixing" or regular QA, and to compare 
the workloads of different client application versions. To be clear: IMO it's 
not meant to be some kind of "security audit".

A minimalistic playback tool would just issue N percent (1..100) of all 
contained DML statements and maybe simulate multiple connections. Everything on 
top of that would be out of scope (for now).
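
The "N percent of all DML statements" selection could look roughly like this 
Python sketch. The function names and the keyword list are assumptions for 
illustration: SELECT and BEGIN (for BEGIN BATCH) are counted as DML here, which 
a real tool might decide differently, and classification by first keyword is 
deliberately naive. Each DML statement is kept independently with probability 
N/100, so the result is approximately, not exactly, N percent.

```python
import random

# Assumed set of leading keywords treated as DML; BEGIN covers BEGIN BATCH.
DML_KEYWORDS = ("SELECT", "INSERT", "UPDATE", "DELETE", "BEGIN")


def is_dml(statement):
    """Naive check based on the first keyword of the statement."""
    words = statement.split(None, 1)
    return bool(words) and words[0].upper() in DML_KEYWORDS


def sample_for_playback(statements, percent, seed=None):
    """Keep each DML statement with probability percent/100, in order."""
    if not 1 <= percent <= 100:
        raise ValueError("percent must be in 1..100")
    rng = random.Random(seed)  # a seed makes a playback run repeatable
    return [s for s in statements
            if is_dml(s) and rng.random() * 100 < percent]
```

With percent=100 every DML statement is replayed, since rng.random() is always 
below 1.0. Simulating multiple connections would mean grouping the selected 
statements by their recorded connection before issuing them.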

> Workload sampling
> -----------------
>
>                 Key: CASSANDRA-8929
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8929
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>
> Workload *recording* looks to be unworkable (CASSANDRA-6572).  We could build 
> something almost as useful by sampling the requests sent to a node and 
> building a synthetic workload with the same characteristics using the same 
> (or anonymized) schema.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
