> Hi Tilman
>
> Thanks for your thoughts, I think that your concerns are already covered by
> my original proposal, I’ll try to explain why and how:
>
>> Of course I agree with the need for regression tests, however it isn't easy:
>> besides the problems of the different JDKs (I use JDK7 Windows 64 bit),
>> there is the problem that some enhancements create slight changes in
>> rendering that are not errors, i.e. both the "before" and the "after" files
>> look OK by themselves. This has happened when we changed the text rendering
>> recently, and has happened again when the clipping was improved. The cause
>> is probably slight changes in color or in boundaries.
>
> If a rendering has changed then the regression test should fail. When a
> failure occurs the developer needs to manually inspect the differences (we
> could generate a visual diff which highlights what changed to make this
> easier) and if ok then they can replace the known-good PNG with the one just
> rendered. Indeed this will be the basic workflow for working with regression
> tests.
>
I think this is the only way to handle that situation. The same applies to text extraction etc.: if an improvement changes the results, the "base" needs to be reset by adding the new image, text etc. as the validation source.

A basic testbed could also run against other JDKs, e.g. without validating against the known-good files, so we pick up potential issues early. That should be easy with Jenkins, and the results would be treated as a hint.

>> Copyright is a problem: I'm testing mostly with JIRA attachments that I've
>> downloaded over the years. While uploading such files to JIRA might count as
>> fair use, I doubt that this would still be true if they are included in a
>> distribution. Instead, they should be stored somewhere on Apache servers
>> where only committers and build software ("Travis", "Jenkins", ...) can
>> access them. The public PDFs that Maruan mentions possibly don't have all
>> the problem cases that we solved before. However I have started working with
>> these files and there are at least 5 recent issues that deal with them.
>
> The PDFs won’t be in a distribution. They will just happen to be stored in an
> SVN repo but not our source code repo, in the same way that the website is
> stored in the “cmssite” branch of SVN or indeed, are on JIRA. The law doesn’t
> distinguish between JIRA and SVN, both are publicly available via HTTP, so
> using SVN will simply be a continuation of what we’re already doing with JIRA.
>
> The crucial factor is that we’re only storing publicly available PDFs,
> because we have the right to do so, just like Google’s cache, and like we
> currently do with JIRA.
>
> Additionally, the PDFs need to be version controlled otherwise we won’t be
> able to reliably recreate previous builds, so storing the files on a web
> server won’t be practical. Also committers will frequently be updating the
> renderings as bugs are fixed and we’ll need to version-control the rendered
> PNG files for the same reason.
> Finally, having committers-only files doesn’t
> fit well with the Apache goal of open development and would be unnecessary
> anyway given that all the PDFs are to be taken from public sources only.
>
> In summary, I’m proposing that we just keep doing what we’re currently doing
> with JIRA but we move it into its own SVN repo along with some pre-rendered
> PNGs.

In addition, if we put in workarounds to handle nonconforming PDFs, a unit test should be added for each one to make sure that we don’t break it, e.g. when rewriting the parser.

>
>> Re preflight: the default mode should be to have the Isartor tests on.
>> Individuals could still disable them locally, but the central build software
>> should always use them.
>
> Yes - does anybody know why this isn’t the default?
>

No. +1 for enabling it by default.

> -- John
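
To make the visual diff suggested above a bit more concrete, here is a minimal sketch in Java using only the standard library. The class and method names (`RenderingDiff`, `countDifferingPixels`, `visualDiff`) are hypothetical and not existing PDFBox API. The exact pixel-for-pixel comparison is also just an assumption; a real test might want a tolerance for small color changes.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only, not part of PDFBox.
// Compares a fresh rendering with a known-good image and builds a visual diff.
final class RenderingDiff {

    private RenderingDiff() {
    }

    /** Returns the number of pixels that differ, or -1 if the dimensions differ. */
    static long countDifferingPixels(BufferedImage expected, BufferedImage actual) {
        if (expected.getWidth() != actual.getWidth()
                || expected.getHeight() != actual.getHeight()) {
            return -1;
        }
        long count = 0;
        for (int y = 0; y < expected.getHeight(); y++) {
            for (int x = 0; x < expected.getWidth(); x++) {
                if (expected.getRGB(x, y) != actual.getRGB(x, y)) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Builds a diff image: changed pixels are red, unchanged pixels are white. */
    static BufferedImage visualDiff(BufferedImage expected, BufferedImage actual) {
        int width = expected.getWidth();
        int height = expected.getHeight();
        if (width != actual.getWidth() || height != actual.getHeight()) {
            throw new IllegalArgumentException("Images must have the same dimensions");
        }
        BufferedImage diff = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                boolean same = expected.getRGB(x, y) == actual.getRGB(x, y);
                diff.setRGB(x, y, same ? 0xFFFFFF : 0xFF0000);
            }
        }
        return diff;
    }
}
```

A regression test would fail whenever `countDifferingPixels` returns anything other than 0, and write the result of `visualDiff` next to the rendering so the developer can inspect it before deciding whether to replace the known-good PNG.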