Hi Gil,

I would have to disagree, as in this case I believe there is CO due to the threading model, CO on a per-thread basis, as well as plain old omission. I believe these conditions are in addition to the conditions you're pointing to.

You may test at a fixed rate for HFT, but in most worlds random arrival times are necessary. Unfortunately, that makes the problem more difficult to deal with.

Regards,
Kirk
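For concreteness, one common way to produce the random arrivals Kirk refers to is a Poisson process, where the gaps between planned send times are exponentially distributed around the target rate. The sketch below is purely illustrative; the class and method names are invented for this example and are not part of JMeter.

    import java.util.Random;

    // Illustrative sketch: pre-computing a randomized (Poisson) arrival schedule.
    public class PoissonSchedule {

        // Returns 'count' absolute send times (in nanos), starting at startNanos.
        public static long[] plan(double requestsPerSecond, int count, long startNanos) {
            Random rnd = new Random();
            double meanGapNanos = 1_000_000_000.0 / requestsPerSecond;
            long[] sendTimes = new long[count];
            long next = startNanos;
            for (int i = 0; i < count; i++) {
                sendTimes[i] = next;
                // Exponentially distributed gap: -mean * ln(U), with U uniform in (0, 1].
                next += (long) (-meanGapNanos * Math.log(1.0 - rnd.nextDouble()));
            }
            return sendTimes;
        }
    }

Because the planned send times are fixed before the run begins, a late send is still detectable afterwards by comparing actual send times against the plan; what a random schedule loses, as discussed below, is the repeating pattern a corrector could project from.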
On 2013-10-18, at 5:32 PM, Gil Tene <g...@azulsystems.com> wrote:

> I don't think the thread model is the core of the Coordinated Omission problem. Unless we consider the only solution to be sending no more than one request per 20 minutes from any given thread a threading-model fix. It's more of a configuration choice the way I see it, but a pretty impossible one. The thread model may need work for other reasons, but CO is not one of them.
>
> In JMeter, as with all other synchronous testers, Coordinated Omission is a per-thread issue. It's easy to demonstrate CO with JMeter with a single client thread testing an application that has only a single client connection in the real world, or with 15 client threads testing an application that has exactly 15 real-world clients communicating at high rates (common with muxed environments, messaging, ESBs, trading systems, etc.). No amount of threading or concurrency will help capture better test results for these very real systems. Any occurrence of CO will make the JMeter results seriously bogus.
>
> When any one thread misses a planned request sending time, CO has already occurred, and there is no way to avoid it at that point. You can certainly detect that CO has happened. The question is what to do about it in JMeter once you detect it. The major options are:
>
> 1. Ignore it and keep working with the data as if it actually meant anything. This amounts to http://tinyurl.com/o46doqf .
>
> 2. You can try to change the tester behavior to avoid CO going forward. E.g. you can try to adjust the number of threads up AND, at the same time, the frequency at which each thread sends requests, which amounts to drastically changing the test plan in reaction to system behavior. In my opinion, changing behavior dynamically will have very limited effectiveness, for two reasons. The first is that the problem has already occurred, so all the data up to and including the observed CO is already bogus and has to be thrown away unless it can be corrected somehow. Only after you have auto-adjusted enough times to stop seeing CO for a long stretch can the results collected during that stretch be considered valid. The second is that changing the test scenario is valid (and possible) for very few real-world systems.
>
> 3. You can try to correct for CO when you observe it. There are various ways this can be done, and most of them amount to re-creating the missing test sample results by projecting from past results. This can help correct the results data set so that it better approximates what a tester that was not synchronous, and had kept issuing requests per the actual test plan, would have experienced in the test.
>
> 4. Something else we haven't yet thought of.
>
> Some correction and detection example work can be found at https://github.com/OutlierCorrector/jmeter/commit/34c34cae673fd0871a423035a9f262d049f3d9e9 , which uses code at https://github.com/OutlierCorrector/OutlierCorrector . Michael Chmiel worked at Azul Systems over the summer on this problem, and the OutlierCorrector package and the small patch to JMeter (under the docs-2.9 branch) are some of the results of that work. This fix approach appears to work well as long as no explicitly random behavior is stated in the test scenarios (the outlier detector detects a test pattern and repeats it in repairing the data; expressly random scenarios will not exhibit a detectable pattern).
>
> -- Gil.
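As a concrete illustration of option 3: the HdrHistogram library mentioned later in this thread exposes a recordValueWithExpectedInterval method that performs exactly this kind of projection, back-filling the samples a stalled synchronous tester failed to issue. A minimal sketch, assuming a test plan with a constant expected interval between requests:

    import org.HdrHistogram.Histogram;

    // Minimal sketch of option 3: correcting recorded latencies for CO.
    public class CoCorrectedRecorder {
        // Track up to 1 hour in nanoseconds, at 3 significant decimal digits.
        private final Histogram histogram = new Histogram(3_600_000_000_000L, 3);
        private final long expectedIntervalNanos;

        public CoCorrectedRecorder(long expectedIntervalNanos) {
            this.expectedIntervalNanos = expectedIntervalNanos;
        }

        public void record(long latencyNanos) {
            // If latencyNanos exceeds the expected interval, HdrHistogram also
            // records the phantom samples a non-synchronous tester would have
            // seen (latency - interval, latency - 2*interval, ...).
            histogram.recordValueWithExpectedInterval(latencyNanos, expectedIntervalNanos);
        }

        public long valueAtPercentile(double percentile) {
            return histogram.getValueAtPercentile(percentile);
        }
    }

Note that the correction leans on a known constant interval; as the paragraph above says, expressly random scenarios give the corrector no pattern to project from.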
> On Oct 17, 2013, at 11:47 PM, Kirk Pepperdine <kirk.pepperd...@gmail.com> wrote:
>
>> Hi Sebb,
>>
>> In my testing, the option of creating threads on demand instead of all at once has made a huge difference in my ability to control the rate of arrivals at the server. It has convinced me that simply using the throughput controller isn't enough and that the threading model in JMeter *must* change. It is the threading model that is the biggest source of CO in JMeter. Unfortunately, we weren't able to agree on a non-disruptive way of changing JMeter to make this happen.
>>
>> The model I was proposing would have JMeter generate an event heap sorted by the time at which a sampler should be fired. A thread pool would pull events off the heap and fire them as scheduled. This would allow JMeter to break the inappropriate one-thread-per-user relationship. The solution is not perfect, in that you will still have to fight with thread schedulers and hypervisors to get things to happen on cue. However, I believe the end result would be a far more scalable product that requires far fewer threads to produce far higher loads on the server.
>>
>> As for your idea of using the throughput controller: IMHO, triggering an assert only worsens the CO problem. In fact, if the response times from the timeouts are not added into the results (in other words, they are omitted from the data set), you've only made the problem worse, because you are filtering bad data points out of the result sets, making the results look better than they should be. Peter Lawrey's (included here for the purpose of this discussion) technique for correcting CO is simply to recognize when the event should have been triggered and to start the timer for that event at that time. The latency reported will then include the time spent waiting before the event actually fired.
>>
>> Gil Tene has done some work with JMeter; I'll leave it up to him to post what he's done. The interesting bit he's created is HdrHistogram (https://github.com/giltene/HdrHistogram). It is not only a better way to report results, it also offers techniques to calculate and correct for CO. Gil might also be able to point you to a more recent version of his talk on CO. It might be nice to have a new sampler that incorporates this work.
>>
>> On a side note, I've got a Servlet filter, a JMX component, that measures a bunch of stats from the server's point of view. It's something that could be contributed, as it could be used to help understand the source of CO, if not just complement JMeter's view of latency.
>>
>> Regards,
>> Kirk
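A minimal sketch of the event-heap model Kirk describes, with a DelayQueue standing in for the time-sorted heap and latency measured from the planned fire time, per Peter Lawrey's technique. All names here are invented for illustration; this is not JMeter code.

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // Sketch: a time-sorted event heap consumed by a small worker pool,
    // breaking the one-thread-per-user coupling.
    public class EventHeapRunner {

        // A sampler firing, scheduled for an absolute System.nanoTime() deadline.
        static class ScheduledSample implements Delayed {
            final long fireAtNanos;
            final Runnable sampler;

            ScheduledSample(long fireAtNanos, Runnable sampler) {
                this.fireAtNanos = fireAtNanos;
                this.sampler = sampler;
            }

            @Override
            public long getDelay(TimeUnit unit) {
                return unit.convert(fireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(fireAtNanos, ((ScheduledSample) other).fireAtNanos);
            }
        }

        private final DelayQueue<ScheduledSample> heap = new DelayQueue<>();

        public EventHeapRunner(int workers) {
            ExecutorService pool = Executors.newFixedThreadPool(workers);
            for (int i = 0; i < workers; i++) {
                pool.execute(() -> {
                    try {
                        while (!Thread.currentThread().isInterrupted()) {
                            ScheduledSample s = heap.take(); // blocks until an event is due
                            s.sampler.run();
                            // Per Peter Lawrey's technique: the clock starts at the
                            // planned fire time, so any lateness in dequeuing shows
                            // up as latency rather than being silently omitted.
                            long latencyNanos = System.nanoTime() - s.fireAtNanos;
                            // record(latencyNanos) ...
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }

        public void schedule(long fireAtNanos, Runnable sampler) {
            heap.put(new ScheduledSample(fireAtNanos, sampler));
        }
    }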
>> On 2013-10-18, at 12:27 AM, sebb <seb...@gmail.com> wrote:
>>
>>> It looks to be quite difficult to avoid the issue of Coordinated Omission without a major redesign of JMeter.
>>>
>>> However, it may be a lot easier to detect when the condition has occurred. This would potentially allow the test settings to be changed to reduce or eliminate the occurrences, e.g. by increasing the number of threads or by spreading the load across more JMeter instances.
>>>
>>> The Constant Throughput Controller calculates the desired wait time, and if this is less than zero (i.e. a sample should already have been generated), it could trigger the creation of a failed Assertion showing the time difference.
>>>
>>> Would this be sufficient to detect all CO occurrences? If not, what other metric needs to be checked?
>>>
>>> Even if it is not the only possible cause, would it be useful as a starting point?
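To make sebb's suggestion concrete, here is a sketch of what the negative-wait check might look like. The names below are hypothetical and invented for illustration; they do not correspond to JMeter's actual Constant Throughput Timer internals.

    // Hypothetical sketch of sebb's suggestion: flag CO whenever the pacing
    // calculation yields a negative wait (the sample is already late).
    public class ThroughputPacer {
        private final long intervalMillis;   // planned gap between samples
        private long nextPlannedMillis;      // next planned send time

        public ThroughputPacer(double samplesPerMinute, long startMillis) {
            this.intervalMillis = (long) (60_000.0 / samplesPerMinute);
            this.nextPlannedMillis = startMillis;
        }

        // Returns ms to wait before the next sample; a negative value means the
        // planned send time has already passed by that many ms (CO has occurred).
        public long nextDelay(long nowMillis) {
            long delay = nextPlannedMillis - nowMillis;
            nextPlannedMillis += intervalMillis;
            if (delay < 0) {
                // This is where a failed Assertion carrying the time difference
                // could be raised, rather than silently absorbing the lateness.
                System.err.printf("CO detected: sample is %d ms late%n", -delay);
            }
            return delay;
        }
    }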