Andreas J. Koenig wrote:
>>>>>> On Mon, 10 Dec 2007 21:12:51 -0800, Michael G Schwern <[EMAIL PROTECTED]> said:
>
> > Adam Kennedy posed me a stumper on #toolchain tonight. In short, having a
> > test which checks your signature doesn't appear to be an actual deterrent to
> > tampering. The man-in-the-middle can just delete the test, or just the
> > SIGNATURE file since it's not required. So why ship a signature test?
>
> Asking the wrong question. None of our testsuites is there to protect
> against spoof or attacks. That's simply not the goal. Same thing for
> 00-signature.t

We would seem to be agreeing. If the goal of the test suite is not to protect
against spoofing, and if it doesn't accomplish that anyway, why put a signature
check in there?

> > The only thing I can think of is to ensure the author that the signature
> > they're about to ship is valid, but that's not something that needs to be
> > shipped.
>
> Has the world changed over night? Are we now questioning tests instead
> of encouraging them? Do now suddenly authors have to justify their
> testing efforts?
>
> I don't mind if we set up a few rules what tests should and should not
> do, but then this topic needs to be put into perspective.
>
> > It appears that a combination of a CHECKSUMS check against another CPAN mirror
> > and a SIGNATURE check by a utility external to the code being checked is
> > effective, and that's what the CPAN shell does. The CHECKSUMS check makes
> > sure the distribution hasn't been tampered with. Checking against a CPAN
> > mirror other than the one you downloaded the distribution from checks that the
> > mirror has not been compromised. Checking the SIGNATURE ensures that the
> > module is from who you think its from.
>
> Yupp. And testing the signature in a test is better than not testing
> it because a bug in a signature or in crypto software is as alarming
> as a bug in perl or a module.

I believe this to be outside the scope of a given module's tests. It's not the
responsibility of every CPAN module to make sure that your crypto software is
working. Or perl. Or the C compiler. Or make. That's the job of the toolchain
modules which more directly use them (CPAN, Module::Signature, MakeMaker,
Module::Build, etc...). [1]

At some point you have to trust that the tools work; you can't test the whole
universe. You simply don't have the time.

That brings me to the central reason why we've started to examine tests for
removal. There's a certain cost/benefit ratio to be considered.
What's the cost of implementing and maintaining a test, what's the benefit and
does the benefit justify the cost? What's the opportunity cost, could you be
doing something more useful with that time and effort? Finally, what's the cost
in terms of test suite confidence? How many false negatives are your users
willing to endure before they lose confidence?

The fixed cost of a test is in writing it. This includes both writing the test
itself and possibly altering the code being tested to make it testable. It's a
fixed cost because you do it once and then you're done.

The recurring costs include diagnosing failures. The user loses time due to a
halted installation. They contact the author, who has to diagnose the failure
and communicate the results back to the user. If the test found a bug, then the
cost has a benefit and it's worthwhile. But if the test failed because it's a
bad test, or because of something that's out of the author's control or that
the user doesn't care about, then there's little or no benefit.

Then there's the cost of confidence. Tests are only useful if someone pays
attention to them. A failed test should be a clear indication of an actual
problem. This is why "expected failures" (and their related "expected
warnings") are so insidious. False failures erode the mental link between "test
failure" and "bug". Get enough of them, and it doesn't take many, and people
start to ignore any failure. This is one of the most dangerous social problems
for a test suite. A test that results in a lot of false negatives has a high
recurring cost for no benefit.

Finally there's the question of opportunity cost. Instead of writing and
maintaining a faulty test, what else could you have been doing with that time?
Could you have been doing something with an even higher benefit? If so, you
should do it instead.

Let's look at the example of Test::More. The last release has 120 passes and
just 4 failures.
http://cpantesters.perl.org/show/Test-Simple.html#Test-Simple-0.74

What are those four failures? Three are due to a threading bug in certain
vendor patched versions of perl, one is due to the broken signature test.

Look at the previous gamma release, 0.72. 256 passes, 9 failures. 5 due to the
threading bug, 4 from the signature test.

0.71: 73 passes, 2 failures. 1 signature, 1 threads
0.70: 221 passes, 12 failures. 3 signature, 9 threads

And so on. That's nine months with nothing but false negatives.

The signature test is not actually indicating a failure in Test::More, so it's
of no benefit to me or the users, and the bug has already been reported to
Module::Signature. The threading test is indicating a perl bug that's very
difficult to detect [2], only seems to exist in vendor patched perls, is
something I can't do anything about and is unlikely to affect anyone since
there are so few threads users. It's already been reported to the various
vendors but it'll clear up as soon as they stop mixing bleadperl patches into
5.8.

In short, I'm paying for somebody else's known bugs. I get nothing. Test::More
gets nothing. The tools get nothing. Cost with no benefit. So why am I
incurring these costs? Maybe the individual users find out their tools are
broken, but it's not my job to tell them that.

I've kept in the threading test because the perl bug it's tickling does have a
direct effect on Test::More, and it could indicate future threading issues, but
lacking any way to resolve it I'm tempted to pull it.

The signature test, otoh, does not indicate anything that affects Test::More.
The ability or inability to check the signature has nothing to do with the
operation of Test::More. So why am I checking it?

The Test::More test suite isn't a full service gas station. It's not going to
wash the windows and check the oil and give you directions. It makes sure
Test::More works and that's that.

As you can see, this is a considered analysis.

In general, redundant tests are ok.
They're often not truly redundant but just have a large overlap. And this all
assumes the tests have some benefit. And often it's more trouble than it's
worth to ferret out test redundancies.

Worse yet is the creeping mental attitude of "should I write this test? Can I
justify the cost?" This is like the related attitude of "can I justify cleaning
up this code?" Unchecked, both lead to paralysis.

But in this case it's a test which has no benefit to the module, [for the
signature test] indicates no failure of module functionality, is reporting on
known bugs in other tools and is a test that should be and is done by other
modules more directly. Cost and redundancy with no benefit.


[1] There are exceptions. For example, if you rely on a specific, questionable
(possibly undocumented) feature you should test that it's still there to make
diagnosing and debugging easier when it inevitably changes out from under you.

[2] You have to run the test dozens or hundreds of times to get it to fail on
an affected perl.

--
We do what we must because we can.
For the good of all of us,
Except the ones who are dead.
    -- Jonathan Coulton, "Still Alive"