> On Mar 9, 2018, at 2:23 PM, Mike Drob <[email protected]> wrote:
>
> Trying to figure out what a sane way to test personality changes is.
>
> I have a patch on a project that adds a flag to the mvn execution to enable
> an additional profile.
>
> When Jenkins runs the precommit tests, I looked at the logs and both the
> branch and patch execution included that profile.
>
> I appreciate that there is a comment left at the start of the run that there
> is a change in the test environment, but it still seems strange that the
> branch and patch executions end up being identical. This makes a failure
> difficult to reason about.
>
> Are there downsides to running branch in the old environment and patch in the
> new?
As with most things in life, the answer is a “maybe”.
The “oh, the environment has changed!” code was designed with quite a
few things in mind.
The biggest one was that, at least in my experience, some of the more
common personality changes (e.g., adding a new test, module ordering, …)
absolutely required running the patched personality in both branch and patch
mode to make sure that the output still made sense. Did our new test actually
work? Did the diff actually work?
There are also some routines and variables in the personality that are
only ever called once. Waiting until the 2nd compile cycle means that some
parts will never get tested until commit.
Time is always a factor. In the early days, the restarting of
test-patch and the Docker launch were two separate events. When these were
merged into one event, it shaved quite a bit of “non-productive” time off and
removed quite a bit of code, but at the additional cost of having a chunk of
code that Yetus itself couldn’t test prior to commit. This was felt to be an
acceptable compromise given the number of times the Yetus code base actually
gets tested vs. users using Yetus to test theirs. :)
Additionally, personality changes are almost always made in contrast to
an existing run. This means people working on a personality always have the
ability to compare a previous run vs. their patched run.
Given all of that, the personality testing bit was lumped into the
other environment testing code to try and give the widest possible coverage.
Especially since comparing an older run vs. a newer run closes all(?) of the
remaining holes.
What should probably happen is that someone writes a tool to compare two
qbt runs. That would probably be the best way to test personalities. It
should be pretty trivial: strip out timestamps, then compare the two output
dirs file by file.
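
A rough sketch of what such a comparison could look like, in Python. Nothing like this exists in Yetus as far as this thread says; the function names and the timestamp pattern are made up for illustration, and the pattern would need adjusting to whatever the real logs contain.

```python
import re
from pathlib import Path

# Assumed timestamp shape (ISO-like date and time); purely illustrative.
TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
)


def normalized_lines(path):
    """Read a file and replace anything that looks like a timestamp."""
    text = path.read_text(errors="replace")
    return [TIMESTAMP.sub("<TS>", line) for line in text.splitlines()]


def relative_files(root):
    """Return every regular file under root, as paths relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def compare_runs(old_dir, new_dir):
    """Compare two output dirs file by file, ignoring timestamps."""
    old_root, new_root = Path(old_dir), Path(new_dir)
    old_files = relative_files(old_root)
    new_files = relative_files(new_root)
    report = {
        "only_in_old": sorted(p.as_posix() for p in old_files - new_files),
        "only_in_new": sorted(p.as_posix() for p in new_files - old_files),
        "changed": [],
    }
    for rel in sorted(old_files & new_files):
        if normalized_lines(old_root / rel) != normalized_lines(new_root / rel):
            report["changed"].append(rel.as_posix())
    return report
```

Pointing compare_runs() at the output dir of a run with the old personality and one with the patched personality would list the files that differ once timestamps are out of the picture. Other run-to-run noise (build host names, durations, temp paths) would probably need the same treatment.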