> Hi Tilman
>
> Thanks for your thoughts. I think that your concerns are already covered by
> my original proposal; I’ll try to explain why and how:
>
>> Of course I agree with the need for regression tests, however it isn't easy:
>> besides the problems of the different JDKs (I use JDK7 Windows 64 bit),
>> there is the problem that some enhancements create slight changes in
>> rendering that are not errors, i.e. both the "before" and the "after" files
>> look OK by themselves. This happened when we changed the text rendering
>> recently, and happened again when the clipping was improved. The cause
>> is probably slight changes in color or in boundaries.
>
> If a rendering has changed then the regression test should fail. When a
> failure occurs the developer needs to manually inspect the differences (we
> could generate a visual diff which highlights what changed to make this
> easier) and, if they look OK, replace the known-good PNGs with the ones just
> rendered. Indeed this will be the basic workflow for working with regression
> tests.
>
I think this is the only way to handle that situation. The same applies to
text extraction etc.: if an improvement changes the results, the "base" needs
to be reset by adding the new image, text, etc. as the validation source.
A basic testbed could also run against other JDKs, e.g. without validating
against the known-good files, so we pick up potential issues early. That should
be easy with Jenkins, and the results would be treated as a hint.
>> Copyright is a problem: I'm testing mostly with JIRA attachments that I've
>> downloaded over the years. While uploading such files to JIRA might count as
>> fair use, I doubt that this would still be true if they are included in a
>> distribution. Instead, they should be stored somewhere on Apache servers
>> where only committers and build software ("Travis", "Jenkins", ...) can
>> access them. The public PDFs that Maruan mentions possibly don't have all
>> the problem cases that we solved before. However, I have started working with
>> these files and there are at least 5 recent issues that deal with them.
>
> The PDFs won’t be in a distribution. They will just happen to be stored in an
> SVN repo but not our source code repo, in the same way that the website is
> stored in the “cmssite” branch of SVN or, indeed, that the PDFs themselves are
> stored on JIRA. The law doesn’t distinguish between JIRA and SVN; both are
> publicly available via HTTP, so using SVN will simply be a continuation of
> what we’re already doing with JIRA.
>
> The crucial factor is that we’re only storing publicly available PDFs,
> because we have the right to do so, just like Google’s cache, and like we
> currently do with JIRA.
>
> Additionally, the PDFs need to be version controlled, otherwise we won’t be
> able to reliably recreate previous builds, so storing the files on a web
> server won’t be practical. Also, committers will frequently be updating the
> renderings as bugs are fixed and we’ll need to version-control the rendered
> PNG files for the same reason. Finally, having committers-only files doesn’t
> fit well with the Apache goal of open development and would be unnecessary
> anyway given that all the PDFs are to be taken from public sources only.
>
> In summary, I’m proposing that we just keep doing what we’re currently doing
> with JIRA but we move it into its own SVN repo along with some pre-rendered
> PNGs.
In addition, if we put in workarounds to handle nonconforming PDFs, there should
be a unit test added to make sure that we don’t break them, e.g. when rewriting
the parser.
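
As an illustration of what I mean, here is a small self-contained sketch. The
workaround shown (tolerating junk bytes before the "%PDF-" header) is just an
example I picked; the class LenientHeaderFinder and the 1024-byte search
window are assumptions for this sketch and are not taken from the PDFBox
parser.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical example of a parser workaround that deserves its own unit test. */
public final class LenientHeaderFinder {

    private static final byte[] MARKER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Assumed search window; the marker must fit entirely inside it. */
    private static final int SEARCH_LIMIT = 1024;

    private LenientHeaderFinder() {
    }

    /**
     * Returns the offset of the "%PDF-" marker within the search window,
     * or -1 if it is not found there.
     */
    public static int findHeaderOffset(byte[] data) {
        int lastStart = Math.min(data.length, SEARCH_LIMIT) - MARKER.length;
        for (int i = 0; i <= lastStart; i++) {
            if (matchesAt(data, i)) {
                return i;
            }
        }
        return -1;
    }

    /** Checks whether the marker starts at the given offset. */
    private static boolean matchesAt(byte[] data, int offset) {
        for (int j = 0; j < MARKER.length; j++) {
            if (data[offset + j] != MARKER[j]) {
                return false;
            }
        }
        return true;
    }
}
```

The accompanying unit test would feed it a conforming file start, one with
leading junk, and one without any header, and pin down the expected offsets.
If somebody later rewrites the parser and drops the workaround, that test
fails right away.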
>
>> Re preflight: the default mode should be to have the Isartor tests on.
>> Individuals could still disable them locally, but the central build software
>> should always use them.
>
> Yes - does anybody know why this isn’t the default?
>
No.
+1 for enabling it by default
> -- John