Steve Fink wrote (and I edited slightly):
> <groan> I can't figure out why so many people misinterpret my RFC12
> as requiring a solution to the halting problem.
A large class of incompletely expressed suggestions seems to get lumped
under "This requires solving the halting problem!" with no further
explanation.
Looking at your RFC12, I imagine the problem is that it is impossible
to know at compile time if a variable is going to get used or not. When
exactly is the warning supposed to be generated?
I got the same response to a misinterpretation of, or suggestion for
reimplementing, tainting (based on the theory of propagate/carry adders)
which would classify every function as
    taint-generating,
    taint-stopping,
    or taint-propagating
and also, on a second axis, as
    taint-vulnerable (which moots its status as a generator/propagator)
    or taint-invulnerable
in order to
    immediately identify unsafe practices at compile time, and
    identify "safe zones" in which dynamic taint checking could be turned
    off with zero semantic impact.
For instance,
    sub one { system shift }
is taint-vulnerable and taint-generating,
    sub extractip { ($_[0] =~ m/(\d+\.\d+\.\d+\.\d+)/)[0] }
is taint-invulnerable and taint-stopping,
    sub readaline() { my $s; return $s = <STDIN> }
is taint-invulnerable and taint-generating,
    sub double { return $_[0] . $_[0] }
is taint-invulnerable and taint-propagating, and
    sub detaint($) { # prepare file names for reading as inputs
        $_[0] =~ m{([\w/]+)} and return $1;
        print "Yucky file name, try again?";
        return detaint(readaline());
    }
is an invulnerable stopper.
Recursive cases are not a problem: a recursive call simply passes through
the status of the function it refers to, and that status can only come
from the non-recursive results. All recursive algorithms eventually
decompose into their base cases, so we can ignore the recursive calls and
take the worst case of the other functions called in returned expressions.
    sub r2detaint($) { # poorly prepare file names for reading as inputs
        $_[0] =~ m{([\w/]+)} and return $1;
        print "Yucky file name, try again?";
        my $newline = readaline(); # tainted
        return $newline gt 'A' ? reverse(shift) : r2detaint($newline);
    }
is a propagator, since of the three possible returned expressions
    untainted, fresh from the match,
    the reverse of the input, and
    the recursive call,
the worst case is propagation.
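
The worst-case rule could be sketched roughly like this. It is only an
illustration, not a proposed implementation: the names %RANK and
worst_case, the string labels, and the 'recursive' marker are all made up
for this sketch.

```perl
use strict;
use warnings;

# Hypothetical ordering of return-status labels, from best to worst.
my %RANK = (stopper => 0, propagator => 1, generator => 2);

# Combine the statuses of every expression a sub can return.
# Recursive calls contribute nothing of their own, so they are skipped.
sub worst_case {
    my @statuses = grep { $_ ne 'recursive' } @_;
    die "no non-recursive return expression\n" unless @statuses;
    my $worst = 'stopper';
    for my $s (@statuses) {
        die "unknown status '$s'\n" unless exists $RANK{$s};
        $worst = $s if $RANK{$s} > $RANK{$worst};
    }
    return $worst;
}

# r2detaint: match result, reverse of the input, recursive call
print worst_case('stopper', 'propagator', 'recursive'), "\n";
```

For the three return expressions of r2detaint this prints "propagator".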
The existence of "eval" does not require that the halting problem
get solved, either: "eval-string" is vulnerable, "eval-block" is
a propagator.
Unknown and external functions must be considered vulnerable generators.
A datum can then be checked _at compile time_ for safeness, so that only
possibly-tainted data need to have their taint bits checked.
    sub copyfromnamedsources {
        # literal definition is a taint-stopper
        my $infilename = 'timestamp_and_introduction|';
        # "open" is vulnerable, so taint-check $outfilename
        open OUT, $outfilename;
        # $infilename is known AT COMPILE TIME to be safe
        # so dynamic taint-checking it can be optimized out
        open IN, $infilename;
        print OUT <IN>; # print is an invulnerable stopper
        # assignment to lexical from taint-generator
        $infilename = readaline();
        # open is taint-vulnerable
        #open IN, $infilename; # this would be a compile-time error
        open IN, detaint($infilename); # detaint is a stopper
        print OUT <IN>;
        open IN, shift; # this will still require dynamic checking.
        print OUT <IN>;
    }
Anyone disagree with the above analysis?
My early naive claim "Taint-checking can be done at compile time!" is too
strong: settling the last case (do we need to taint-check data derived from
function params?) is not possible without running the program.
A weaker claim, "Much taint-checking can be done at compile time!", may be
true, providing a faster taint mode. I do not know how much taint-checking
is currently done, or how; the documentation seems to indicate that taint
status is kept on all data, and a check is performed before evaluating any
vulnerable expression.
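
To make the weaker claim concrete, the compile-time decision for a single
call might look something like the sketch below. The function name verdict
and the labels 'clean', 'tainted' and 'unknown' are hypothetical; nothing
here reflects how perl implements taint mode.

```perl
use strict;
use warnings;

# Hypothetical compile-time verdict for passing a datum to a function.
#   $vulnerable : true if the callee is taint-vulnerable (e.g. open, system)
#   $status     : what the compiler knows about the datum:
#                 'clean', 'tainted', or 'unknown'
sub verdict {
    my ($vulnerable, $status) = @_;
    return 'no check needed'    if !$vulnerable;
    return 'no check needed'    if $status eq 'clean';
    return 'compile-time error' if $status eq 'tainted';
    return 'dynamic check';
}

print verdict(1, 'clean'),   "\n";   # literal file name passed to open
print verdict(1, 'tainted'), "\n";   # result of readaline passed to open
print verdict(1, 'unknown'), "\n";   # function parameter passed to open
print verdict(0, 'tainted'), "\n";   # tainted data passed to print
```

Only the 'unknown' case leaves any work for run time, which is where the
speedup would come from if most data turn out to be classifiable.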
--
David Nicol 816.235.1187 [EMAIL PROTECTED]
I don't watch TV, I have no telephone, and I vote