You're completely right. The separation of performance tests and
correctness tests is one path forward. My only concern there is
that, historically, such tests tend to be ignored and die off.
I think the reason this is in the normal bucket of ITs is just because we
don't have rigor around your 4th point about perf evaluations.
Maybe, we could make some junit category to annotate such tests and make
them runnable via Maven, removing them from normal execution. I think
that would be an acceptable way forward.
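A minimal sketch of that approach (the category name and package are hypothetical; this assumes JUnit 4 categories and the Maven Failsafe plugin): annotate the slow ITs with a marker interface via `@Category(PerformanceTest.class)`, and exclude that group from the default run in the pom:

```xml
<!-- pom.xml fragment: exclude tests annotated with a (hypothetical)
     PerformanceTest category from the normal IT execution -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <configuration>
    <excludedGroups>org.apache.accumulo.test.categories.PerformanceTest</excludedGroups>
  </configuration>
</plugin>
```

The performance ITs could then still be run on demand, e.g. with `-DexcludedGroups=` on the command line or via a dedicated Maven profile that sets `<groups>` instead.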
However, that would leave us with no end-to-end test for ACCUMULO-3327,
which isn't great.
Dylan Hutchison wrote:
Hi Josh,
Forgive me for the design question, but shouldn't we distinguish tests of
correctness from tests of performance? The following is my understanding of
test categories, which does not totally align with Accumulo's test suite:
* Unit tests test individual components.
* Integration tests test using components together. They may require more
resources such as starting an Accumulo (MAC or real).
* Examples are executable code separate from the above, that an outside
developer or user can read to see how Accumulo is used. Examples have their
own tests.
* Performance evaluations are executable code separate from the above. They
range in complexity from a simple "test bulk imports" to RandomWalk with
agitation.
If performance evaluations run separately, then developers can treat them
like benchmarks, comparing times to those on similar hardware or across
commits.
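As a sketch of the benchmark-style treatment (a toy harness I'm using for illustration, not Accumulo code), a performance evaluation could report elapsed time for comparison instead of failing on a hard deadline:

```java
// Toy sketch: time an operation and report the duration, rather than
// asserting a fixed cutoff that depends on the hardware it runs on.
class BenchmarkReport {

    // Runs the operation once and returns elapsed wall-clock milliseconds.
    static long timeMillis(Runnable op) {
        long start = System.nanoTime();
        op.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            // Stand-in workload for the operation under test
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        });
        // Report the number for comparison across commits/hardware
        System.out.println("operation took " + elapsed + " ms");
    }
}
```

Developers (or CI) could then track these numbers over time rather than getting a hard pass/fail that is meaningless across different machines.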
Could you remind me of the reasons why we keep performance tests in the
standard set of ITs?
On Aug 13, 2016 1:03 PM, "Josh Elser" <[email protected]> wrote:
I had assumed this test would pass locally (early-2013 MBP, 2.7 GHz Intel
Core i7, 16G ram), but nope! 38s and 45+ seconds on two runs.
Josh Elser wrote:
Hi,
I have some complaints about FastBulkImportIT (a test added with
https://issues.apache.org/jira/browse/ACCUMULO-3327) but no good ideas
for how to better test it. As it presently stands, the test's outcome is
highly sensitive to the hardware used to run it.
The test launches a 3-tserver MAC instance, creates about 585 splits on
a table, creates 100 files with ~1200 key-value pairs, and then waits
for the table to be balanced.
At this point, it imports these files into that table and fails if that
takes longer than 30s.
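One way to soften a hard cutoff like that 30s deadline (a sketch; the `timing.factor` property name is hypothetical, not an existing Accumulo knob) is to scale it by an environment-supplied factor so the same test can still pass on slower developer hardware:

```java
// Sketch: scale a hard-coded deadline by a factor the developer can set,
// e.g. -Dtiming.factor=2.0 on slower machines. Property name is made up.
class ScaledTimeout {
    static final long BASE_TIMEOUT_SEC = 30;

    // Returns the effective deadline after applying the scaling factor.
    static long scaledTimeoutSec() {
        double factor = Double.parseDouble(System.getProperty("timing.factor", "1.0"));
        return (long) (BASE_TIMEOUT_SEC * factor);
    }

    public static void main(String[] args) {
        System.out.println("bulk import deadline: " + scaledTimeoutSec() + "s");
    }
}
```

This keeps the assertion meaningful on a developer's default setup while giving an escape hatch for constrained hardware.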
On my VPS (3 cores, 6G RAM, "SSD"), the bulk import takes ~45 seconds.
This test will never pass on this node, which bothers me: I am of the
opinion that anyone with reasonable hardware should be able to run our
tests (and, to be clear, I believe this is reasonable hardware).
Does anyone have any thoughts on how we could stabilize this test for
developers?
- Josh