Re: Proposed (optional) kwalitee metric; use re 'taint' / Per-author tests?

2008-06-25 Thread David Cantrell
On Tue, Jun 24, 2008 at 07:08:00PM +1000, Paul Fenwick wrote:

> As the user of a module, it's possible for me to pass in tainted data.  The
> module doesn't know from where it's been sourced.  However, unless the
> *intent* of the module is to untaint this data, anything derived from that
> data should probably remain tainted.

If you care about tainting, then I suggest that it's up to you to make
sure that your data doesn't get accidentally untainted.  The best way to
do this is to check it as early as possible and either untaint it
yourself or reject it.

> Yes, taint mode isn't an iron-clad guarantee of security, and if you don't
> trust a module, don't use it.  However taint mode can be a useful safety
> net, and for me it would be nice if more people were aware of it and how it
> interacts with their code.

If you've turned taint-mode on in your code, then you're aware of what
it means and how it works, and certainly *should* be aware of how other
people's code might interact with that.  If you really want my code to
be taint-safe, then I would be delighted to accept a patch with tests,
and maybe even to give you a commit bit and make you a co-maintainer.

But I won't just accept blindly adding 'use re qw(taint)'.

> As a completely off-the-bat suggestion that could be controlled by
> META.yml:
>
>   cpants:
>     disable:
>       - has_test_pod_coverage
>       - uses_no_re_taint
>       - valid_gpg_signature
>
>     enable:
>       - included_in_slackware
>       - won_poetry_competition
>       - includes_Tolkien_quote

Put it in a separate file if you're going to have it at all, so that it
doesn't get overwritten by $release_tool_of_the_month.  And make
it really easy to disable whole swathes of tests without having to type
all their names.  In particular, I don't want to disable
includes_Tolkien_quote and won_poetry_competition today, only to
have to add has_funny_true_value_at_end_of_file_of_module tomorrow.

-- 
David Cantrell | Nth greatest programmer in the world

Immigration: making Britain great since AD43


Re: Proposed (optional) kwalitee metric; use re 'taint' / Per-author tests?

2008-06-24 Thread Paul Fenwick

G'day chromatic,

chromatic wrote:

> I find it difficult to believe that you've audited all of my
> publicly-available code in the past 90 minutes to know exactly how much even
> uses regular expressions,


My original suggestion clearly should have come with some qualifications.

Obviously modules which don't use regular expressions at all aren't going to 
be untainting data by accident through regexp matches.  Under such a 
theoretical test, these modules can be excluded by default.


> let alone on data that could possibly come from tainted sources.

This has me confused.

As the user of a module, it's possible for me to pass in tainted data.  The 
module doesn't know from where it's been sourced.  However, unless the 
*intent* of the module is to untaint this data, anything derived from that 
data should probably remain tainted.  Likewise, unless the purpose of 
the module is to untaint incoming data, anything the module reads from an 
external source should probably also remain tainted.


At the moment, the default behaviour of regular expressions makes it very 
easy to untaint by accident, which is unfortunate.  I even remember (a very 
long time ago) when CGI.pm used to untaint by accident, which was extremely 
unfortunate.  Most module authors don't even think about taint mode when 
writing their code, because most module authors don't use it, which makes 
catching this sort of behaviour that much more difficult.
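
For readers who have not met the behaviour under discussion, here is a
minimal sketch of it.  The helper name taint_report is hypothetical and
appears nowhere in the thread; the sketch relies only on the core
Scalar::Util::tainted function and the re pragma.  It compares a
do-nothing capture under the default rules with the same capture under
"use re 'taint'".

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper (not from the thread): report which copies of
# $input are tainted after passing through a catch-all regex capture.
sub taint_report {
    my ($input) = @_;

    # Default behaviour: a capture is untainted, even though this
    # pattern accepts any string and so validates nothing.
    my ($default) = $input =~ /\A(.*)\z/s;

    my $kept;
    {
        use re 'taint';    # lexically scoped to this block
        ($kept) = $input =~ /\A(.*)\z/s;
    }

    return (
        input    => tainted($input)   ? 1 : 0,
        default  => tainted($default) ? 1 : 0,
        re_taint => tainted($kept)    ? 1 : 0,
    );
}

# Demo: run as "perl -T this_script.pl some-argument".
unless (caller) {
    my $arg = defined $ARGV[0] ? $ARGV[0] : '';
    my %report = taint_report($arg);
    printf "taint mode is %s\n", ${^TAINT} ? 'on' : 'off';
    printf "%-8s %s\n", $_, $report{$_} ? 'tainted' : 'clean'
        for qw(input default re_taint);
}
```

Run with -T and a command-line argument, the default capture should
come back clean while the one made under "use re 'taint'" stays tainted.
Without -T nothing is tainted, so all three lines report clean.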


Yes, taint mode isn't an iron-clad guarantee of security, and if you don't 
trust a module, don't use it.  However taint mode can be a useful safety 
net, and for me it would be nice if more people were aware of it and how it 
interacts with their code.


> Some of them are even *useful*.  I like those.  The useless distracty ones and
> the actively harmful ones -- not so much.


This is the core of the whole automated kwalitee metrics debate, isn't it? 
The metrics are subjective: one person's useful heuristic is another 
person's painful annoyance.


It's clear that my idea is going to easily fall into the painful annoyance 
category, so I'm withdrawing the proposal to add it as an optional CPANTS 
test for now.


On that note, surely we could save a lot of anguish with regards to many of 
the CPAN tests just by making the optional ones[1] actually optional?  As a 
completely off-the-bat suggestion that could be controlled by META.yml:


  cpants:
    disable:
      - has_test_pod_coverage
      - uses_no_re_taint
      - valid_gpg_signature

    enable:
      - included_in_slackware
      - won_poetry_competition
      - includes_Tolkien_quote

My apologies if this has already been discussed and I've missed it.

Cheerio,

Paul

[1] Or even all of them?

--
Paul Fenwick [EMAIL PROTECTED] | http://perltraining.com.au/
Director of Training   | Ph:  +61 3 9354 6001
Perl Training Australia| Fax: +61 3 9354 2681


Re: Proposed (optional) kwalitee metric; use re 'taint' / Per-author tests?

2008-06-24 Thread David Golden
On Tue, Jun 24, 2008 at 5:08 AM, Paul Fenwick [EMAIL PROTECTED] wrote:
> As the user of a module, it's possible for me to pass in tainted data.  The
> module doesn't know from where it's been sourced.  However, unless the
> *intent* of the module is to untaint this data, anything derived from that
> data should probably remain tainted.  Likewise, unless the purpose of
> the module is to untaint incoming data, anything the module reads from an
> external source should probably also remain tainted.

I think I disagree with this. (Though perhaps could be argued out of
it.)  It seems to me that data should be validated at the time it is
collected and untainted once validated.  I don't see why some
subroutine N levels down the call stack in some utility module should
be expected to preserve taint on data you didn't check when you
received it.
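
A minimal sketch of that "validate where the data is collected" approach
follows.  The function name untaint_username and the username rule are
made up for illustration and are not from anyone's code in this thread.

```perl
use strict;
use warnings;

# Hypothetical validator: accept only short usernames made of
# letters, digits and underscores.  Returns the captured (and, under
# -T, untainted) value on success, or undef on rejection.
sub untaint_username {
    my ($raw) = @_;
    return unless defined $raw;

    # No "use re 'taint'" here on purpose: the capture is the
    # deliberate untainting step, and the pattern is the validation.
    # \z (rather than $) refuses a trailing newline.
    if ($raw =~ /\A([A-Za-z0-9_]{1,32})\z/) {
        return $1;
    }
    return;
}

# Demo: run as "perl -T this_script.pl some_name".
unless (caller) {
    my $name = untaint_username($ARGV[0]);
    die "rejected: not a valid username\n" unless defined $name;
    print "accepted: $name\n";
}
```

The point of the design is that the pattern is strict and anchored, so
untainting happens in one visible place rather than as a side effect of
some unrelated match further down the call stack.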

-- David


Re: Proposed (optional) kwalitee metric; use re 'taint' / Per-author tests?

2008-06-24 Thread brian d foy
In article [EMAIL PROTECTED], Paul Fenwick
[EMAIL PROTECTED] wrote:

> On that note, surely we could save a lot of anguish with regards to many of
> the CPAN tests just by making the optional ones[1] actually optional?  As a
> completely off-the-bat suggestion that could be controlled by META.yml:

Why should I have to work to disable something I don't want and doesn't
apply to me? Why make me do something that distracts me from the real
issue of writing better modules and packaging distributions wisely?


Re: Proposed (optional) kwalitee metric; use re 'taint' / Per-author tests?

2008-06-24 Thread chromatic
On Tuesday 24 June 2008 02:08:00 Paul Fenwick wrote:

> As the user of a module, it's possible for me to pass in tainted data.  The
> module doesn't know from where it's been sourced.  However, unless the
> *intent* of the module is to untaint this data, anything derived from that
> data should probably remain tainted.  Likewise, unless the purpose of
> the module is to untaint incoming data, anything the module reads from an
> external source should probably also remain tainted.

That's my point.  Most of the uses of regexes in most of the distributions 
I've written never act on data that could possibly be tainted (unless you 
somehow named subroutines or classes with tainted strings, in which case you 
have worse things to worry about than my code).

-- c