> Hi Tilman
> 
> Thanks for your thoughts. I think that your concerns are already covered by 
> my original proposal; I’ll try to explain why and how:
> 
>> Of course I agree with the need for regression tests, however it isn't easy: 
>> besides the problems of the different JDKs (I use JDK7 Windows 64 bit), 
>> there is the problem that some enhancements create slight changes in 
>> rendering that are not errors, i.e. both the "before" and the "after" files 
>> look OK by themselves. This happened when we changed the text rendering 
>> recently, and happened again when the clipping was improved. The cause 
>> is probably slight changes in color or in boundaries.
> 
> If a rendering has changed then the regression test should fail. When a 
> failure occurs the developer needs to manually inspect the differences (we 
> could generate a visual diff which highlights what changed to make this 
> easier) and, if the changes are acceptable, replace the known-good PNGs with the ones just 
> rendered. Indeed this will be the basic workflow for working with regression 
> tests.
> 

I think this is the only way to handle that situation. The same applies to 
text extraction etc.: if an improvement changes the results, the “base” needs 
to be reset by adding the new image, text, etc. as the validation source.
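
To make the manual inspection easier, the visual diff mentioned above could be 
sketched roughly as follows. This is only an illustration under my own 
assumptions: the class name RenderingDiff and its methods are hypothetical and 
not part of PDFBox, it uses plain java.awt.image.BufferedImage, and it does an 
exact per-pixel comparison with no tolerance for small color changes.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only; not an existing PDFBox class.
class RenderingDiff {

    // Colors used in the diff image (opaque red and opaque white).
    static final int CHANGED = 0xFFFF0000;
    static final int UNCHANGED = 0xFFFFFFFF;

    // Builds an image of the same size in which every pixel that differs
    // between the known-good and the new rendering is marked in red.
    static BufferedImage diff(BufferedImage expected, BufferedImage actual) {
        int w = expected.getWidth();
        int h = expected.getHeight();
        if (w != actual.getWidth() || h != actual.getHeight()) {
            throw new IllegalArgumentException("image sizes differ");
        }
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean same = expected.getRGB(x, y) == actual.getRGB(x, y);
                out.setRGB(x, y, same ? UNCHANGED : CHANGED);
            }
        }
        return out;
    }

    // Counts the marked pixels; zero means the renderings are identical.
    static int countChanged(BufferedImage diffImage) {
        int count = 0;
        for (int y = 0; y < diffImage.getHeight(); y++) {
            for (int x = 0; x < diffImage.getWidth(); x++) {
                if (diffImage.getRGB(x, y) == CHANGED) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A test would fail whenever countChanged returns a non-zero value and write the 
diff image next to the new rendering, so the developer can decide whether to 
promote the new PNG to known-good.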

A basic testbed could also run against other JDKs, e.g. without validating 
against the known-good files, so we pick up potential issues early. That should 
be easy with Jenkins, with the results treated as a hint.


>> Copyright is a problem: I'm testing mostly with JIRA attachments that I've 
>> downloaded over the years. While uploading such files to JIRA might count as 
>> fair use, I doubt that this would still be true if they are included in a 
>> distribution. Instead, they should be stored somewhere on Apache servers 
>> where only committers and build software ("Travis", "Jenkins", ...) can 
>> access them. The public PDFs that Maruan mentions possibly don't have all 
>> the problem cases that we solved before. However I have started working with 
>> these files and there are at least 5 recent issues that deal with them.
> 
> The PDFs won’t be in a distribution. They will just happen to be stored in an 
> SVN repo but not our source code repo, in the same way that the website is 
> stored in the “cmssite” branch of SVN or, indeed, as the PDFs already are on 
> JIRA. The law doesn’t distinguish between JIRA and SVN: both are publicly 
> available via HTTP, so using SVN will simply be a continuation of what we’re 
> already doing with JIRA.
> 
> The crucial factor is that we’re only storing publicly available PDFs,  
> because we have the right to do so, just like Google’s cache, and like we 
> currently do with JIRA.
> 
> Additionally, the PDFs need to be version controlled otherwise we won’t be 
> able to reliably recreate previous builds, so storing the files on a web 
> server won’t be practical. Also committers will frequently be updating the 
> renderings as bugs are fixed and we’ll need to version-control the rendered 
> PNG files for the same reason. Finally, having committers-only files doesn’t 
> fit well with the Apache goal of open development and would be unnecessary 
> anyway given that all the PDFs are to be taken from public sources only.
> 
> In summary, I’m proposing that we just keep doing what we’re currently doing 
> with JIRA but we move it into its own SVN repo along with some pre-rendered 
> PNGs.

In addition, if we put in workarounds to handle nonconforming PDFs, there should 
be a unit test added to make sure that we don’t break them, e.g. when rewriting 
the parser.

> 
>> Re preflight: the default mode should be to have the Isartor tests on. 
>> Individuals could still disable them locally, but the central build software 
>> should always use them.
> 
> Yes - does anybody know why this isn’t the default?
> 

No.

+1 for enabling it by default


> -- John
