Re: Regression Testing

Petr Slabý Mon, 07 Jul 2014 00:09:07 -0700

Hi,
the following describes what we do at our company.

With our software, we run regression tests after each nightly build, and sometimes it is a tough fight. If there is a regression, it is not easy to find which commit caused it, because there are potentially many commits between two nightly builds. Deciding whether the change is wanted and expected is in some cases also difficult (this part might be easier with PDF, where there is the "golden standard" rendering in Acrobat). If the change is expected and the new rendering is "better", then one has to commit the new reference. This means that the files produced on the nightly build machine must be available somehow; it is almost impossible to produce them locally, because the rendering results differ slightly between Java versions and for many other reasons. All this has to be done before the next regression test runs, so that new regressions are not hidden by earlier ones. Our complete build with all tests runs for several hours...


To improve this workflow, we now use the following scheme in addition:
- there is a smaller set of regression tests which runs relatively fast

- these tests are triggered by each commit in formatting and rendering related projects

- before running the test itself, the modified project(s) are compiled locally, w/o publishing the result to maven

- the reference rendering files are stored in SVN

- if a test finds a regression, it immediately stores the new result as a new reference in SVN. This makes sure that a) the test renderings do not get lost and b) each regression points exactly to the commit that caused it - the one that triggered the test. The failed test creates a new issue in JIRA with a pointer to the before and after renderings in SVN and a bitmap of the differences. The issue is then processed. If we find the change to be expected, the issue is simply closed; otherwise we take action to fix the problem. The only annoying thing about this scheme is that, after committing the correction, the test runs again and reports a regression, because it now compares against the faulty version of the rendering.
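A minimal sketch of the per-commit check described above, not the actual implementation. The class, method and `RegressionReporter` names are hypothetical; the SVN commit and JIRA issue are left to the reporter hook, a byte-for-byte file comparison stands in for whatever comparison is really used, and a local ".before" copy stands in for the SVN history.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class PerCommitCheck {

    /** Hypothetical hook: in the described setup this would commit to SVN and open a JIRA issue. */
    public interface RegressionReporter {
        void report(String commitId, Path before, Path after) throws IOException;
    }

    /**
     * Compares a fresh rendering with the stored reference.
     * On a mismatch, the old reference is kept as "<name>.before" (assumption:
     * a stand-in for the SVN history), the new rendering becomes the reference,
     * and the reporter is told which commit triggered the run.
     *
     * @return true if a regression was found
     */
    public static boolean check(String commitId, Path reference, Path rendering,
                                RegressionReporter reporter) throws IOException {
        byte[] expected = Files.readAllBytes(reference);
        byte[] actual = Files.readAllBytes(rendering);
        if (Arrays.equals(expected, actual)) {
            return false; // rendering unchanged, nothing to do
        }
        Path before = reference.resolveSibling(reference.getFileName() + ".before");
        Files.copy(reference, before, StandardCopyOption.REPLACE_EXISTING);
        Files.copy(rendering, reference, StandardCopyOption.REPLACE_EXISTING);
        reporter.report(commitId, before, reference);
        return true;
    }
}
```

Because the new rendering always becomes the reference, a later fix is itself reported as a change against the faulty reference, which is the annoyance mentioned above.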


Best regards,
Petr.

-----Original Message-----
From: John Hewson

Sent: Friday, July 04, 2014 7:39 PM
To: dev@pdfbox.apache.org
Subject: Re: Regression Testing

Hi Tilman

Thanks for your thoughts. I think your concerns are already covered by my original proposal; I’ll try to explain why and how:

Of course I agree with the need for regression tests, however it isn't easy: besides the problems of the different JDKs (I use JDK7 Windows 64 bit), there is the problem that some enhancements create slight changes in rendering that are not errors, i.e. both the "before" and the "after" files look OK by themselves. This has happened when we changed the text rendering recently, and has happened again when the clipping was improved. The cause is probably slight changes in color or in boundaries.

If a rendering has changed then the regression test should fail. When a failure occurs the developer needs to manually inspect the differences (we could generate a visual diff which highlights what changed to make this easier) and, if the change is OK, replace the known-good PNG with the one just rendered. Indeed this will be the basic workflow for working with regression tests.
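As a rough illustration of such a visual diff, here is a minimal sketch that compares two rendered pages pixel by pixel and marks every difference in red. The `VisualDiff` class and `diff` method are hypothetical names, not existing PDFBox code, and an exact pixel match is assumed; a real test might need a tolerance for small color changes.

```java
import java.awt.image.BufferedImage;

public class VisualDiff {

    /**
     * Returns an image in which differing pixels are red and matching pixels
     * are white, or null if both images are identical. Areas covered by only
     * one of the two images count as differences.
     */
    public static BufferedImage diff(BufferedImage expected, BufferedImage actual) {
        int width = Math.max(expected.getWidth(), actual.getWidth());
        int height = Math.max(expected.getHeight(), actual.getHeight());
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        boolean changed = false;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                boolean inExpected = x < expected.getWidth() && y < expected.getHeight();
                boolean inActual = x < actual.getWidth() && y < actual.getHeight();
                if (inExpected && inActual && expected.getRGB(x, y) == actual.getRGB(x, y)) {
                    out.setRGB(x, y, 0xFFFFFF); // unchanged pixel: white
                } else {
                    out.setRGB(x, y, 0xFF0000); // changed or missing pixel: red
                    changed = true;
                }
            }
        }
        return changed ? out : null;
    }
}
```

A test could load the known-good PNG and the fresh rendering with `javax.imageio.ImageIO.read`, fail when `diff` returns a non-null image, and write that image next to the rendering for manual inspection.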

Copyright is a problem: I'm testing mostly with JIRA attachments that I've downloaded over the years. While uploading such files to JIRA might count as fair use, I doubt that this would still be true if they are included in a distribution. Instead, they should be stored somewhere on Apache servers where only committers and build software ("Travis", "Jenkins", ...) can access them. The public PDFs that Maruan mentions cannot possibly cover all the problem cases that we solved before. However, I have started working with these files and there are at least 5 recent issues that deal with them.

The PDFs won’t be in a distribution. They will just happen to be stored in an SVN repo, but not our source code repo, in the same way that the website is stored in the “cmssite” branch of SVN, or indeed as they already are on JIRA. The law doesn’t distinguish between JIRA and SVN: both are publicly available via HTTP, so using SVN will simply be a continuation of what we’re already doing with JIRA.

The crucial factor is that we’re only storing publicly available PDFs, because we have the right to do so, just like Google’s cache, and like we currently do with JIRA.

Additionally, the PDFs need to be version controlled, otherwise we won’t be able to reliably recreate previous builds, so storing the files on a web server won’t be practical. Also, committers will frequently be updating the renderings as bugs are fixed, and we’ll need to version-control the rendered PNG files for the same reason. Finally, having committers-only files doesn’t fit well with the Apache goal of open development and would be unnecessary anyway, given that all the PDFs are to be taken from public sources only.

In summary, I’m proposing that we just keep doing what we’re currently doing with JIRA, but we move it into its own SVN repo along with some pre-rendered PNGs.

Re preflight: the default mode should be to have the Isartor tests on. Individuals could still disable them locally, but the central build software should always use them.


Yes - does anybody know why this isn’t the default?

-- John
