On Feb 12, 2008 11:43 AM,  <[EMAIL PROTECTED]> wrote:
> Quoting Jesse Becker <[EMAIL PROTECTED]>:
> > In the meantime, I started on a patch to put all of the variable
> > checks and sanitation in one place.
>
> Do you mean move the checks out of get_context.php and graph.php?

Yes, pretty much all of it.

> > brings up the question:  are we "checking" or "cleaning?"
>
> Good point.  clean_string() will actually change data, escaping
> unwanted HTML which might be present (cleaning).  clean_number() will
> return NULL if the input value is not numeric (checking).

I see the potential for 2 groups of functions:  is_valid_type() and
clean_type().  The is_valid* set are, for the most part, already part
of the PHP core in the form of is_numeric, is_bool, etc.  If we want
our own wrappers around them for the sake of consistency, that's not
difficult.  However, there are some datatypes that require manual
validation:  the check for valid hexcolors and image sizes come to
mind.   These all return true or false, and do not change the data.
Cleaning is, IMO, an inherently different operation, and implies
making changes to the data.

I've traced a few of the display and  "undefined index" problems to
strings that return NULL, with no other warning.

I also have things setup so that any data that is not explicitly
defined is *thrown away* during the sanitation process.  This is to
keep something from sneaking in unchecked, either by mistake or
malintent.

I wish PHP had taint checking.

> I agree the 'check' approach is less error prone.  If input seems to
> be malformed, it's a bad idea to try to guess what the user intended.
> Perhaps we ought to add some logic after the input validations whereby
> if any validations failed, the script exits with an error message.

There will be plenty of errors thrown in case of invalid data.  I'm
inherently distrustful of all data that isn't explicitly set; this is
especially true for data coming from URL query-strings.

> Other thoughts on validations and filtering:
>
> There are some input values (like $sort) that currently have
> escapeshellcmd() run on them, but are never used in a shell command.
> We could save a bit by only running escapeshellcmd() when it's
> actually needed, just before the shell call to rrdtool.  For most of
> these, escapeshellarg() would be more appropriate, since they are
> arguments not actual commands.

Sure.  It is somewhat simpler to just run everything through though,
instead of having to distinguish which variables are used for what.

> Putting all user-supplied input into some type of container, like an
> array called $user, would add clarity to the code.  Wherever this

I'll think about this.  Right now, I'm checking the variables in-place
($_GET and $_COOKIE), and throwing away ones that are invalid or
unknown.  Since so much of the code pulls directly from $_GET, this
makes for fewer changes elsewhere in the code.  However, if you used
$_GET directly, it's pretty obvious where it came from. :)

> input is used, it's obvious where the value originally came from.
> We'd need to decide if this stores the raw input, a filtered/validated
> value, or both.  I'd favor doing the same thing for all values set in
> conf.php: put them in a $conf array.  When reading through a script,
> this makes it much easier to identify the original source of whatever
> value is being used.

I like the idea of $conf[], and use that concept a lot of my own programs.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers

Reply via email to