On 04/10/2014 04:34 AM, Kirill Müller wrote:

On 03/26/2014 06:46 PM, Paul Gilbert wrote:


On 03/26/2014 04:58 AM, Kirill Müller wrote:
Dear list


It is possible to store expected output for tests and examples. From the
manual: "If tests has a subdirectory Examples containing a file
pkg-Ex.Rout.save, this is compared to the output file for running the
examples when the latter are checked." And, earlier (written in the
context of test output, but apparently applies here as well): "...,
these two are compared, with differences being reported but not causing
an error."

I think a NOTE would be appropriate here, so that such differences can be
detected by looking only at the check summary. Is there a reason for not
flagging differences here?

The problem is that differences occur too often, because this is a
character-by-character comparison of the output files (a diff). Any
output that is affected by locale, node name, Internet downloads, time,
host, or OS is likely to cause a difference. Also, if you print results
at high precision you will get differences on different systems,
depending on the OS, 32- vs. 64-bit builds, numerical libraries, etc. A
better strategy, when it is numerical results you want to compare, is to
do a numerical comparison and throw an error if the result is not good
enough, something like (the function call and reference value below are
hypothetical)

  r <- yourFunction(x)   # result from your function (hypothetical call)
  rGood <- 1.234567      # known good value, computed once and stored
  fuzz <- 1e-12          # tolerance

  if (fuzz < max(abs(r - rGood))) stop("Test xxx failed.")
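
The same check can also be written with base R's all.equal(), which
compares numeric objects up to a (relative) tolerance:

  ## Equivalent check, reusing r, rGood and fuzz from above.
  if (!isTRUE(all.equal(rGood, r, tolerance = fuzz))) stop("Test xxx failed.")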

It is more work to set up, but the maintenance burden will be lower,
especially when you consider that your tests need to run on different
OSes on CRAN.

You can also use try() to catch and inspect errors if you want to test failure cases.
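
A minimal sketch of that idea (the failing call is just an illustration):

  ## Expect an error: solve() on a singular matrix must fail.
  res <- try(solve(matrix(0, 2, 2)), silent = TRUE)
  if (!inherits(res, "try-error")) stop("Test yyy failed: no error signalled.")

  ## Or use tryCatch() to inspect the error message itself.
  msg <- tryCatch(solve(matrix(0, 2, 2)),
                  error = function(e) conditionMessage(e))
  if (!grepl("singular", msg)) stop("Test yyy failed: unexpected message.")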


Thanks for your input.

To me, this is a different kind of test,

Yes, if you mean comparing character output, it is a different kind of test. With a file in the tests/ directory of a package you can construct a test of character differences in individual commands with something like

  z1 <- as.character(rnorm(5))
  z2 <- as.character(type.convert(z1))  # round-trip through type.convert()
  if (any(z1 != z2)) stop("character differences exist.")

for which no changes to the existing package-checking system are required. One caveat is output that is produced as a side effect. For longer output streams from multiple commands you might construct your own testing with R CMD Rdiff.
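
For instance, a minimal sketch using the R-level interface tools::Rdiff() (the file names are hypothetical; a non-zero status means the files differ):

  status <- tools::Rdiff("mytest.Rout", "mytest.Rout.save")
  if (status != 0L) stop("output differs from the saved reference output")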

As you point out, adding something to flag different levels of severity for differences from a .Rout.save file would require some work by someone.

HTH,
Paul

for which I'd rather use the
facilities provided by the testthat package. Imagine a function that
operates on, say, strings, vectors, or data frames, and that is expected
to produce identical results on all platforms. Here, a
character-by-character comparison of the output is appropriate, and I'd
rather see a WARNING or ERROR if something fails.
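
A minimal testthat sketch of such a test, using toupper() as a stand-in
for the function under test:

  library(testthat)
  test_that("output is character-identical on all platforms", {
    expect_identical(toupper(c("a", "b")), c("A", "B"))
  })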

Perhaps this functionality can be provided by external packages like
roxygen and testthat: roxygen could create the "good" output (if asked
to) and set up a testthat test that compares the example run with the
"good" output. This would duplicate part of the work already done by
base R; the duplication could be avoided if there were a way to specify
the severity of a character-level difference between actual and expected
output, perhaps by means of an .Rout.cfg file in DCF format:

OnDifference: mute|note|warning|error
Normalize: [R expression]
Fuzziness: [number of different lines that are tolerated]

On that note: Is there a convenient way to create the .Rout.save files
in base R? By "convenient" I mean a single function call, not running
the checks and manually copying files as suggested here:
https://stat.ethz.ch/pipermail/r-help/2004-November/060310.html .
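
One way to script that manual step (paths hypothetical) is to run the
test file non-interactively and write the transcript directly to the
reference file:

  system("R CMD BATCH --vanilla tests/mytest.R tests/mytest.Rout.save")

But a single documented function would still be preferable.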


Cheers

Kirill
