[ https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073141#comment-14073141 ]
Lyuben Todorov commented on CASSANDRA-6572:
-------------------------------------------
bq. It looks to me like you need some way to share the statement preparation
across threads, as it can be used by any thread (and across log segments) once
prepared. Probably easiest to do it during parsing of the log file
That seems simple enough: a concurrent map shared across a WorkloadReplayer
should do the job. The problem with doing it while parsing the log is that the
statement might be for a ks / cf that hasn't been created yet.
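
A minimal sketch of such a shared map, assuming statements are prepared lazily
at replay time rather than at parse time, which would sidestep the missing
ks / cf problem. {{SharedStatementCache}} and the {{preparer}} function are
hypothetical names used for illustration only; they are not taken from the
attached patch, and the statement type is left generic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper: one instance per WorkloadReplayer, shared by all replay threads.
final class SharedStatementCache<S> {
    // Query string -> prepared statement, visible to every thread and log segment.
    private final ConcurrentMap<String, S> prepared = new ConcurrentHashMap<>();
    // Assumed hook that actually prepares a query; supplied by the caller.
    private final Function<String, S> preparer;

    SharedStatementCache(Function<String, S> preparer) {
        this.preparer = preparer;
    }

    // Prepares the query on first use; later callers on any thread reuse the result.
    // If the preparer throws (e.g. the ks / cf does not exist yet), no entry is
    // recorded, so a later call can try again.
    S get(String queryString) {
        return prepared.computeIfAbsent(queryString, preparer);
    }

    int size() {
        return prepared.size();
    }
}
```

Each replay thread would call {{get}} just before executing a query, so
preparation happens at most once per distinct query string.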
bq. We also have an issue with replay potentially over-parallelizing, and also
potentially OOMing, as you're submitting straight to a thread pool after
parsing each file. So there's nothing stopping us racing ahead and reading all
of the log files (you have an unbounded queue)
A possible solution is to move the multimap to the class level rather than
having {{WP#read}} create one each time it's called (again, one per
WorkloadReplayer, which is fine since we should only have one per replay).
Then, every time a read completes, we submit the collection of
{{QuerylogSegments}} to be replayed, empty the map, and populate it again if
the same thread-id is encountered in {{WP#read}}. The tricky part is submitting
segments for a given thread-id only once we know the executor doesn't already
have a task running for that thread-id.
bq. Also, we're still replaying based on offset from last query, which means we
will skew very quickly. We should be fixing an epoch (in nanos) such that you
have a log epoch of L, and queries are run at T=L+X; when re-run we have a
replay epoch of R, and we run queries at R+X
It's on the todo list.
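
For reference, the epoch arithmetic from the quote could be sketched as
follows. {{ReplayClock}} and its method names are hypothetical; the sketch
assumes each logged query carries a nanosecond timestamp T = L + X and that
the replay epoch R is read from {{System.nanoTime()}}.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch of epoch-based scheduling, not the patch's implementation.
final class ReplayClock {
    private final long logEpochNanos;    // L: epoch fixed when recording started
    private final long replayEpochNanos; // R: epoch fixed when replay started

    ReplayClock(long logEpochNanos, long replayEpochNanos) {
        this.logEpochNanos = logEpochNanos;
        this.replayEpochNanos = replayEpochNanos;
    }

    // A query logged at T = L + X is replayed at R + X = R + (T - L).
    long targetNanos(long loggedNanos) {
        return replayEpochNanos + (loggedNanos - logEpochNanos);
    }

    // Remaining wait relative to nowNanos; zero if the query is already due.
    long delayNanos(long loggedNanos, long nowNanos) {
        return Math.max(0L, targetNanos(loggedNanos) - nowNanos);
    }

    // Parks the calling thread until the query's target time has been reached.
    void awaitTurn(long loggedNanos) {
        long remaining;
        while ((remaining = targetNanos(loggedNanos) - System.nanoTime()) > 0) {
            LockSupport.parkNanos(remaining);
        }
    }
}
```

Because every target is computed from the fixed epoch R rather than from the
previous query's completion time, a slow query delays only itself and the
error does not accumulate.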
> Workload recording / playback
> -----------------------------
>
> Key: CASSANDRA-6572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
> Project: Cassandra
> Issue Type: New Feature
> Components: Core, Tools
> Reporter: Jonathan Ellis
> Assignee: Lyuben Todorov
> Fix For: 2.1.1
>
> Attachments: 6572-trunk.diff
>
>
> "Write sample mode" gets us part way to testing new versions against a real
> world workload, but we need an easy way to test the query side as well.
--
This message was sent by Atlassian JIRA
(v6.2#6252)