TL;DR: the recommendations are the lettered bullets in the last paragraph.

Here are my observations on how the teams I have worked on use graphs. My teams always use JMeter plugins, as that allows us to collect diagnostic as well as performance data. In addition, we generally use StatsD to generate and report additional performance-related attributes.
Our test rigs always run independently: we kick off a group of load clients running JMeter separately with a shell script, then aggregate the results after the test. This minimizes network traffic unrelated to the test, which improves throughput and simplifies diagnostics when the test doesn't perform well. The aggregation step also lets us integrate the StatsD data stream into the JMeter data stream.

During early test development, we use JMeter's integrated graphics extensively: to gauge the impact of test strategies, to identify early performance problems, and to help identify appropriate load levels. Once an actual test is run, however, we have far more data than can easily be loaded into memory. I have developed tools that cull the data down to sets small enough for various tools to ingest, and I use third-party graphics tools (e.g. Tableau) and some text graphics generated with small custom programs, rather than JMeter, to produce result sets *and to export them for reporting*.

Note that most test statistics are easy to produce without loading a lot of data into memory, *except the sample median*. The sample median is one of the most useful statistics for round-trip times and latency, and one of the most difficult to compute over a very large data set. My 72-hour data sets tend to run several GB in text form; loading the whole data set simply isn't practical. I typically cull out a single variable plus elapsed time, and use primitive types in arrays to keep the RAM requirement down to something I can graph.

One thing I always find myself creating from scratch is analysis plots. For example, I bin the round-trip response times to produce a frequency distribution, and do the same for latency. If you do the same for CPU and heap, you get frequency distributions that provide a much more solid feel for actual behavior than looking at raw CPU vs. time or heap vs. time. And of course, load vs. results graphs (e.g. load vs. CPU, load vs. response time, etc.) are always done by hand or in Excel.

As a user, I believe JMeter would benefit from:

A. Maintaining a real-time graphics capability. This is a "sales tool": a way to show new users and management what the tool is really doing.

B. Some kind of post-test data aggregation tool set, so that data collected outside of JMeter can be integrated into the JMeter data set, and so that remote, disconnected instances can submit their data to a master test hub.

C. A data analysis tool set, which works with data primarily on disk and can produce various statistical and summary graphics. I think a good topic for discussion would be "what do we need here?"

D. A way to annotate and customize graphs, and then export them for publication.
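To illustrate the post-test integration step mentioned above, folding an external metric stream (such as StatsD samples) into a JMeter timeline can be done by matching each sample to the nearest JMeter timestamp. This is only a minimal sketch; the function name, the `(timestamp, value)` tuple shape, and the nearest-timestamp policy are illustrative assumptions, not a description of any actual tooling:

```python
import bisect

def merge_nearest(jmeter_times, external_samples):
    """Attach each external (timestamp, value) sample to the nearest
    JMeter sample time, so both streams share one timeline.
    jmeter_times must be sorted ascending."""
    merged = {}
    for t, value in external_samples:
        i = bisect.bisect_left(jmeter_times, t)
        # consider the two neighbouring JMeter timestamps and pick the closer
        candidates = [j for j in (i - 1, i) if 0 <= j < len(jmeter_times)]
        nearest = min(candidates, key=lambda j: abs(jmeter_times[j] - t))
        merged.setdefault(jmeter_times[nearest], []).append(value)
    return merged

print(merge_nearest([0.0, 1.0, 2.0], [(0.1, 40), (1.6, 55)]))
# → {0.0: [40], 2.0: [55]}
```

A real aggregator would also need to reconcile clocks between hosts and decide how to handle samples falling outside the test window.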
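The culling approach described above (one variable plus elapsed time, held in primitive-typed arrays) might look roughly like the following sketch. The column names are hypothetical, and real files would be streamed from disk rather than held in a string:

```python
import array
import csv
import io

def cull_column(lines, column, time_column="elapsed"):
    """Stream a CSV of test results and keep only the elapsed time plus
    one variable of interest. array.array stores raw doubles instead of
    Python objects, keeping RAM low enough to graph a multi-GB file."""
    times = array.array("d")
    values = array.array("d")
    for row in csv.DictReader(lines):
        times.append(float(row[time_column]))
        values.append(float(row[column]))
    return times, values

# Tiny illustrative input; column names are assumptions for the example.
sample = io.StringIO(
    "elapsed,latency,label\n"
    "0.0,120,login\n"
    "0.5,95,login\n"
    "1.0,210,search\n"
)
t, v = cull_column(sample, "latency")
print(list(t), list(v))  # → [0.0, 0.5, 1.0] [120.0, 95.0, 210.0]
```

Two `array("d")` columns cost 16 bytes per sample, so even tens of millions of samples fit comfortably in memory for plotting.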
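The binning idea above also offers a way around the sample-median problem: if round-trip times are binned into a frequency distribution, an approximate median falls out of the same histogram, with memory bounded by the number of bins rather than the number of samples. This is a minimal sketch under that assumption (bin width and the bin-midpoint convention are arbitrary choices):

```python
from collections import Counter

def binned_median(samples, bin_width=10):
    """Approximate the median of a large stream of values by counting
    samples into fixed-width bins, then walking the bins in order until
    half the samples have been seen. Works on a lazy generator, so the
    full data set never has to be in memory."""
    counts = Counter()
    n = 0
    for s in samples:
        counts[int(s // bin_width)] += 1
        n += 1
    if n == 0:
        raise ValueError("empty sample stream")
    seen = 0
    for b in sorted(counts):
        seen += counts[b]
        if seen * 2 >= n:
            return (b + 0.5) * bin_width  # midpoint of the median bin

print(binned_median([12, 14, 95, 96, 97, 210, 400], bin_width=10))
# → 95.0 (true median is 96; error is bounded by the bin width)
```

The same `counts` table doubles as the frequency-distribution plot data, so one pass over the file serves both purposes.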
