[
https://issues.apache.org/jira/browse/CASSANDRA-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571162#comment-16571162
]
Chris Lohfink commented on CASSANDRA-14436:
-------------------------------------------
While I think we could do something like create a concurrent set of Samplers
for each SamplerType, tie them to a Sampler session, and flag them to start at
the same time, I don't think it's necessary. The current top partitions
implementation has never had a reported issue with people trying to run
profiling sessions concurrently, so that can be a new feature added in another
ticket at some point; I don't think it's needed enough here.
In the meantime I added a strict restriction of a single session at a time,
raising an exception if someone tries to kick off a second one. The sampling
also times out at the end of the duration, so if finish is never called it
won't spin forever.
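
For illustration, the single-session restriction and the timeout could be sketched roughly as below. This is a minimal sketch, not the code in the patch: the class name SingleSamplingSession and the methods begin, finish and isActive are hypothetical, and it assumes a ScheduledExecutorService is available to expire a session whose finish is never called.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: names are illustrative and not taken from the patch.
final class SingleSamplingSession {
    // Holds a token for the running session, or null when idle.
    private final AtomicReference<Object> current = new AtomicReference<>();
    private final ScheduledExecutorService timer;

    SingleSamplingSession(ScheduledExecutorService timer) {
        this.timer = timer;
    }

    /** Starts a session, or throws if another one is already running. */
    void begin(long durationMillis) {
        Object token = new Object();
        if (!current.compareAndSet(null, token))
            throw new IllegalStateException("A sampling session is already running");
        // Safety net: end this session at the end of the duration even if
        // finish() is never called. The token makes a stale timer a no-op,
        // so it cannot end a later session by mistake.
        timer.schedule(() -> current.compareAndSet(token, null),
                       durationMillis, TimeUnit.MILLISECONDS);
    }

    /** Ends the running session, if any. */
    void finish() {
        current.set(null);
    }

    boolean isActive() {
        return current.get() != null;
    }
}
```

Calling begin a second time before finish (or before the timeout fires) throws, which matches the "one at a time" behaviour described above.
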
I did write some basic JMH benchmarks, but I didn't want to make insert()
accessible, and the {{.*microbench.*}} in build.xml makes default visibility not
an option, so... yeah. Ultimately (when on) it's just a
ThreadPoolExecutor.submit() of the addSample in the read/write path, so the
cost is pretty straightforwardly bounded by contention on the queue; I saw
roughly 100-300 nanoseconds. Going into the actual guts, the frequency sampler
is a wrapper around the addthis StreamSummary. There might be something better
out there now, but it has seemed to do fine so far. In some worst-case JMH
benchmarks I was able to see this hit 3us or so, which could conceivably fall
behind the write rate and cause a backup. The MaxSampler uses
MinMaxPriorityQueue, which can be replaced with something more performant once
PriorityQueue(comparator) becomes available (post Java 8), but it rarely
breaks a microsecond even with top 1024.
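
As a rough illustration of the PriorityQueue(comparator) idea, a bounded "keep the N largest samples" structure can be built on a min-heap. This is only a sketch under assumptions: the class TopNMax and its methods are hypothetical and are not the MaxSampler from the patch, and it is not thread-safe, so it assumes a single thread does the inserts.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: keeps the N largest samples seen so far.
final class TopNMax<T> {
    private final int capacity;
    private final Comparator<T> comparator;
    // Min-heap: the head is the smallest sample currently retained.
    private final PriorityQueue<T> heap;

    TopNMax(int capacity, Comparator<T> comparator) {
        if (capacity < 1)
            throw new IllegalArgumentException("capacity must be at least 1");
        this.capacity = capacity;
        this.comparator = comparator;
        this.heap = new PriorityQueue<>(comparator);
    }

    /** Retains the sample only if it belongs in the current top N. */
    void add(T sample) {
        if (heap.size() < capacity) {
            heap.add(sample);
        } else if (comparator.compare(sample, heap.peek()) > 0) {
            heap.poll();      // evict the smallest retained sample
            heap.add(sample);
        }
    }

    /** Returns the retained samples, largest first. */
    List<T> results() {
        List<T> out = new ArrayList<>(heap);
        out.sort(comparator.reversed());
        return out;
    }
}
```

Each add costs at most O(log N) comparisons, and samples smaller than the current minimum are rejected after a single comparison.
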
Just in case, as a catch-all I added the same thing the trace executor has: a
throwaway load-shedding path in case the sampler executor does get backed up.
This includes some plumbing so it's reported appropriately in metrics.
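
The general load-shedding pattern can be sketched with a bounded queue and a rejection handler that drops the task and counts it. This is a minimal sketch, not the executor in the patch or the Cassandra trace executor: the class SheddingSamplerExecutor, its queue capacity parameter and the droppedCount method are assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: a single-threaded executor that sheds load when full.
final class SheddingSamplerExecutor {
    // Number of samples thrown away because the queue was full.
    private final LongAdder dropped = new LongAdder();
    private final ThreadPoolExecutor executor;

    SheddingSamplerExecutor(int queueCapacity) {
        this.executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            // Rejection handler: discard the sample and record the drop
            // instead of blocking or throwing on the read/write path.
            (task, pool) -> dropped.increment());
    }

    /** Queues a sample for processing; never blocks the caller. */
    void submit(Runnable sample) {
        executor.execute(sample);
    }

    /** A value a metrics system could report as dropped samples. */
    long droppedCount() {
        return dropped.sum();
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

With this shape a slow sampler costs dropped samples rather than added latency on the request path, and the counter gives something to expose in metrics.
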
> Add sampler for query time and expose with nodetool
> ---------------------------------------------------
>
> Key: CASSANDRA-14436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14436
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Lohfink
> Assignee: Chris Lohfink
> Priority: Major
>
> Create a new {{nodetool profileload}} that functions just like toppartitions
> but with more data, returning the slowest local reads and writes on the host
> during a given duration and highest frequency touched partitions (same as
> {{nodetool toppartitions}}). Refactor included to extend use of the sampler
> for uses outside of top frequency (max instead of total sample values).
> Future work to this is to include top cpu and allocations by query and
> possibly tasks/cpu/allocations by stage during time window.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)