Re: [PHP] Cleaning user data

2003-03-20 Thread rotsky
That's useful stuff, thanks - and thanks to other respondents.

My main concerns are to avoid junk in the database (and on-screen messages)
and to avoid dangerous and malicious postings, like the one Justin outlined
below (so I guess strip_tags is a major step there). What I have in mind,
then, is:

Use a foreach loop to run through all posted data and perform the following
on each item:
- strip_tags()
- trim()

I'll be saving this stuff to a database, so I'll keep htmlentities for the
display stage.

Also, on a field-by-field basis (depending on what it holds):
- check not empty
- check length
- check against allowable characters and formats
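
The field-by-field checks listed above could be sketched roughly as follows.
This is only an illustration: the helper name check_field(), the error
strings and the phone-number pattern are assumptions, not anything from the
thread.

```php
<?php
// Hypothetical helper: returns a list of problems for one field,
// or an empty array if the value passes every check.
function check_field(string $value, int $maxLength, string $allowedPattern): array
{
    $errors = [];
    $value = trim($value);

    // check not empty
    if ($value === '') {
        $errors[] = 'empty';
    }
    // check length
    if (strlen($value) > $maxLength) {
        $errors[] = 'too long';
    }
    // check against allowable characters (skipped for empty values)
    if ($value !== '' && preg_match($allowedPattern, $value) !== 1) {
        $errors[] = 'bad characters';
    }
    return $errors;
}

// Assumed rule for a phone number: digits, spaces, plus, hyphen, parentheses.
$phonePattern = '/^[0-9 +()\-]+$/D';
```

Each field would get its own maximum length and pattern; a whitelist of
allowed characters tends to be easier to reason about than a list of
forbidden ones.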

I'm still battling with the whole escaped characters business. My hosting
supplier has magic quotes turned on, so on the page that receives the data
from the form, I run the $_POST variables through stripslashes(). And yet
the slashes are still there - eg, in front of apostrophes. Perhaps they've
been escaped twice for some reason. I take them out because the data is
going to be POSTed again before being written to the database. I guess I
need to experiment more.
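
One possible explanation for the leftover slashes (a guess, not a diagnosis)
is that the value was escaped twice, so a single stripslashes() removes only
one layer. The sketch below shows that effect. The function name and the
$magicQuotesOn flag are assumptions; at the time, get_magic_quotes_gpc()
reported the setting, but that function was removed in PHP 8.

```php
<?php
// Hypothetical helper: undo one layer of magic-quotes escaping.
// Assumption: $magicQuotesOn mirrors the host's magic_quotes_gpc setting.
function undo_magic_quotes(array $posted, bool $magicQuotesOn): array
{
    if (!$magicQuotesOn) {
        return $posted;
    }
    foreach ($posted as $key => $value) {
        if (is_string($value)) {
            $posted[$key] = stripslashes($value);
        }
    }
    return $posted;
}

$once  = ['name' => "O\\'Brien"];      // escaped once:  O\'Brien
$twice = ['name' => "O\\\\\\'Brien"];  // escaped twice: O\\\'Brien
```

With $once the apostrophe comes back clean; with $twice one backslash
survives, which matches the symptom described above.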

Justin French [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
<snip>
> BTW: Allowing some tags with strip_tags() is a great security risk:
>
> let's say you allow <b> tags -- then I can go:
>
> <b onmouseover='javascript:window.close();'>hahahaha</b>  --  not good!!
<snip>



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Cleaning user data

2003-03-20 Thread CPT John W. Holmes
> --- John W. Holmes [EMAIL PROTECTED] wrote:
>> I disagree. I think stripping HTML from my text is a horrible thing. If
>> I want to put a <b> in my text, then use htmlentities() and show me a
>> <b> when I look at it. Obviously you don't want to evaluate HTML, but
>> the end result should be that I should see exactly what I typed into the
>> text box.

The real problem I have with strip_tags is that if I want to type <smile> or
grin, it's going to be stripped out and now I have to go back and edit my
code and change it to something else... If you just use htmlentities(), the
user is none the wiser.

>> If you need to allow formatted text, then use something like BBcode
>> where you can specify exactly what is allowed.
>
> Maybe there is something I'm missing, but I have always hated these
> alternative markup languages like BBcode that seem to offer no benefit
> over HTML. If you want to allow the <b> tag to be evaluated, you can do
> something like this after you use htmlentities():
>
> $blah = str_replace('&lt;b&gt;', '<b>', $blah);
> $blah = str_replace('&lt;/b&gt;', '</b>', $blah);
>
> Of course, if people want the <b> to appear exactly as they type it, they
> would either have to use &lt;b&gt;, or you would have to let them choose
> an option as to whether they want to use HTML (much like slash code does).

That would work, too, I guess. If the user actually typed in &lt; it would
be encoded as &amp;lt; and not match something similar to a replacement like
you've shown.

You don't want to do matching like you've shown, though. If I put a <b> on
my page with no </b>, then it's going to make everything on the entire page
following my post bold. When cleaning the data, you want to make sure you
match a pattern that includes both the start and end tag. You can use
regular expressions or go through character by character.
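
A minimal sketch of the regular-expression route, assuming only complete
bold pairs should be re-enabled after encoding. The helper name allow_bold()
is made up for illustration.

```php
<?php
// Hypothetical helper: encode everything, then re-enable only
// complete <b>...</b> pairs.
function allow_bold(string $text): string
{
    $escaped = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Lazy match, so each opening tag pairs with the nearest closing tag.
    // An opening tag with no partner (or with attributes) stays encoded.
    $result = preg_replace('#&lt;b&gt;(.*?)&lt;/b&gt;#s', '<b>$1</b>', $escaped);

    // preg_replace() returns null on a regex error; fall back to the
    // fully encoded text in that case.
    return $result ?? $escaped;
}
```

Because the pattern only matches a bare opening tag followed by a closing
tag, an unclosed tag or a tag carrying attributes is displayed as typed
instead of being evaluated.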

---John Holmes...





Re: [PHP] Cleaning user data

2003-03-20 Thread Chris Shiflett
--- CPT John W. Holmes [EMAIL PROTECTED] wrote:
> You don't want to do matching like you've shown, though. If I put a <b> on
> my page with no </b>, then it's going to make everything on the entire page
> following my post bold.

Well, my example was simplified. If the user's data is contained in a table
cell, they would only screw up the formatting of their own post. For tags that
you really need to make sure are closed, you can check for that prior to making
any replacements.

I still fail to see how BB code helps in any way, since you have to make these
same considerations. But, like I said, maybe I'm missing something. :-)

Chris

=
Become a better Web developer with the HTTP Developer's Handbook
http://httphandbook.org/




Re: [PHP] Cleaning user data

2003-03-20 Thread Leif K-Brooks
My BBCode class takes care of unended tags, and much more.  Have a look 
at http://www.phpclasses.org/browse.html/package/951.html.

Chris Shiflett wrote:

> I still fail to see how BB code helps in any way, since you have to make these
> same considerations. But, like I said, maybe I'm missing something. :-)
--
The above message is encrypted with double rot13 encoding.  Any unauthorized attempt 
to decrypt it will be prosecuted to the full extent of the law.



Re: [PHP] Cleaning user data

2003-03-20 Thread Chris Shiflett
--- Leif K-Brooks [EMAIL PROTECTED] wrote:
> My BBCode class takes care of unended tags, and much more.  Have a look
> at http://www.phpclasses.org/browse.html/package/951.html.

You have to log in to view any source on that site (or so it seems), so no
thanks.

Unended tags are easy enough to handle with HTML, so what benefit does BBcode
offer in that regard? The only conversation regarding BBcode in this thread was
to question its purpose, since it seems to have none.

Chris




RE: [PHP] Cleaning user data

2003-03-20 Thread John W. Holmes
> I still fail to see how BB code helps in any way, since you have to make
> these same considerations. But, like I said, maybe I'm missing something.
> :-)

I agree pretty much. The only way it helps is that it's easier for
people to pick up, however slightly. Instead of explaining to people to
use <a href>, you tell them to just use [url]. It's slightly easier, but
doesn't offer any additional features...
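
For comparison, a [url] tag could be handled along these lines. This is a
sketch under assumptions: the helper name bbcode_url() is invented, and the
restriction to http/https addresses is a choice made here, not something the
thread specifies.

```php
<?php
// Hypothetical helper: encode first, then turn [url]...[/url] into a link,
// but only for http:// or https:// addresses.
function bbcode_url(string $text): string
{
    $escaped = htmlentities($text, ENT_QUOTES, 'UTF-8');

    $result = preg_replace(
        '#\[url\](https?://[^\s\[\]]+)\[/url\]#i',
        '<a href="$1">$1</a>',
        $escaped
    );

    // Fall back to the encoded text if the regex engine reports an error.
    return $result ?? $escaped;
}
```

The same validation work is still needed, as noted above; the only change
is the syntax the user types.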

---John W. Holmes...

PHP Architect - A monthly magazine for PHP Professionals. Get your copy
today. http://www.phparch.com/






[PHP] Cleaning user data

2003-03-19 Thread rotsky
I'd like to canvass opinions about what's needed to clean user input. I'm
using an HTML form where users enter simple things like name and phone
number, but also a couple of small text areas for address and a message (up
to 50 words or so).

How would people recommend cleaning this data when it's received (via
$_POST) in the next page? Some fields (like email) I can check against a
template using ereg(), but the text areas pose more of a problem. I assume
running strip_tags() might be a wise precaution, and maybe also
htmlentities(). Anything else?

I'd be interested to hear what other people do.

a+
Steve






Re: [PHP] Cleaning user data

2003-03-19 Thread Pete James
It really depends on what you want to do with the data.

For instance, if you want to insert into a database, you'll want to run 
addslashes() on it, or some other such quoting.

If you want to use the data as a forum post or comment, etc, you'll want
to strip the HTML out of it with strip_tags(), or encode it with
htmlentities(), like you mentioned.

If you want to use the data in a command-line, you should run 
escapeshellarg() or escapeshellcmd().

If you want to send an email to this person later based on the email 
address they're providing, you may want to use checkdnsrr and a solid 
regex to make sure that this email is reasonably valid.

There are any number of ways to check a piece of user-submitted data.  You
have to evaluate what it is you want to do with it, and at every stage 
make every effort to ensure that it is what you think it is.

There is no such thing as safe data, just less-dangerous data.
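
As a rough illustration of treating the same value differently depending on
where it is going (the sample string and variable names are made up):

```php
<?php
$input = "Tom & <b>Jerry</b>";

// For a web page: encode so the browser shows the text instead of parsing it.
$forHtml = htmlentities($input, ENT_QUOTES, 'UTF-8');

// For a shell command: quote so the whole value is passed as one argument.
// The exact quoting is platform dependent.
$forShell = escapeshellarg($input);

// For a post where markup is simply unwanted: remove the tags.
$plain = strip_tags($input);
```

None of these is interchangeable with the others; the escaping has to match
the destination.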

HTH.
Pete.






RE: [PHP] Cleaning user data

2003-03-19 Thread John W. Holmes
> I'd like to canvass opinions about what's needed to clean user input. I'm
> using an HTML form where users enter simple things like name and phone
> number, but also a couple of small text areas for address and a message
> (up to 50 words or so).
>
> How would people recommend cleaning this data when it's received (via
> $_POST) in the next page? Some fields (like email) I can check against a
> template using ereg(), but the text areas pose more of a problem. I assume
> running strip_tags() might be a wise precaution, and maybe also
> htmlentities(). Anything else?

For a textarea, apply htmlentities() before you save it in the database.
This will let you safely display it on a web page and re-display it in
another textarea for further editing.

If you need to use the data in an email or file, then only apply
htmlentities() when you display the data on a web page, not when you
save it in the database.
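
A small sketch of the second approach, where the text is stored as typed and
only encoded when it is written into a page, here for re-display in a
textarea. The helper name render_textarea() is an assumption.

```php
<?php
// Hypothetical helper: build a textarea pre-filled with stored text.
// The stored value is encoded here, at display time, not when saved.
function render_textarea(string $name, string $stored): string
{
    return '<textarea name="' . htmlentities($name, ENT_QUOTES, 'UTF-8') . '">'
         . htmlentities($stored, ENT_QUOTES, 'UTF-8')
         . '</textarea>';
}
```

The raw value stays usable for email or files, and the user sees in the
textarea exactly what was typed.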

Bottom line, as you hopefully know, VALIDATE EVERYTHING!

---John W. Holmes...




Re: [PHP] Cleaning user data

2003-03-19 Thread olinux
You can also use basic functions like is_numeric() [to
make sure the value is numeric - duh] or a custom
function to do something like check for a valid email
address format.

I have a news site that explodes the URL to get values
for the directory/article it is supposed to display.
Since the types of articles are limited, I just use an
array of these values and check that the piece that I
have matches one of them.

URL example /news/php/123.htm

$article_types = array('php', 'javascript', 'perl');

// Break up the URL path using '/' as the delimiter
$url_array = explode('/', $_SERVER['REQUEST_URI']);
$article_type = $url_array[2];  // php
$article_id   = str_replace('.htm', '', $url_array[3]);  // 123

// Only continue if the type is on the list and the id is numeric
if ( in_array($article_type, $article_types) && is_numeric($article_id) )
{
   ... query for article and display ...
}
else
{
   ... display 404 error ...
}








Re: [PHP] Cleaning user data

2003-03-19 Thread Justin French
The first rule is to NEVER rely on anything that they give you, or any of
the security precautions in your form code, because someone can always create
a less-secure form which posts to the same script.

So, whilst maxlength='4' for a year select thing is great, you should check
at the other end that

a) it is only four digits
b) it is_numeric()
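
A possible way to do both checks in one step. Note that is_numeric() on its
own also accepts strings such as "1e03", which is four characters long and
numeric but not a year, so this sketch uses a digits-only pattern instead.
The function name is an assumption.

```php
<?php
// Hypothetical helper: true only for exactly four ASCII digits.
function is_valid_year(string $value): bool
{
    return preg_match('/^[0-9]{4}$/D', $value) === 1;
}
```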


TEXTAREA's don't even have a max length from memory, so if you want to limit
to n characters, that's easy using strlen() to check it, or substr() to chop
it.

For 50 words (as per your OP), you can check it with:

<?php
$words = explode(' ', $_POST['about_me']);
if(count($words) > 50)
{
// error
}
else {
// good
}
?>

or chop it with

<?php
$text = $_POST['about_me'];
$words = explode(' ', $text);
if(count($words) > 50)
{
// keep only the first 50 words, then flag the cut
$text = implode(' ', array_slice($words, 0, 50));
$text .= '... [too long]';
}
echo $text;
?>


Untested, season to taste.


And yes, definitely strip_tags(), and follow the advice on the rest of the
thread.


BTW: Allowing some tags with strip_tags() is a great security risk:

let's say you allow <b> tags -- then I can go:

<b onmouseover='javascript:window.close();'>hahahaha</b>  --  not good!!
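
The risk is easy to reproduce: strip_tags() with an allowed-tags argument
keeps the allowed tag whole, attributes included. A small demonstration
(variable names are made up):

```php
<?php
$post = "<b onmouseover='window.close();'>hahahaha</b>";

// Allowing <b> keeps the entire tag, including the event handler.
$kept = strip_tags($post, '<b>');

// Encoding instead leaves nothing for the browser to interpret.
$encoded = htmlentities($post, ENT_QUOTES, 'UTF-8');
```

Here $kept is identical to $post, while $encoded contains no angle brackets
at all.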


Justin







RE: [PHP] Cleaning user data

2003-03-19 Thread John W. Holmes
> And yes, definitely strip_tags(), and follow the advice on the rest of
> the thread.

I disagree. I think stripping HTML from my text is a horrible thing. If
I want to put a <b> in my text, then use htmlentities() and show me a
<b> when I look at it. Obviously you don't want to evaluate HTML, but
the end result should be that I should see exactly what I typed into the
text box.

If you need to allow formatted text, then use something like BBcode
where you can specify exactly what is allowed. 

---John W. Holmes...




Re: [PHP] Cleaning user data

2003-03-19 Thread Justin French
on 20/03/03 3:53 PM, John W. Holmes ([EMAIL PROTECTED]) wrote:

>> And yes, definitely strip_tags(), and follow the advice on the rest of
>> the thread.
>
> I disagree. I think stripping HTML from my text is a horrible thing. If
> I want to put a <b> in my text, then use htmlentities() and show me a
> <b> when I look at it. Obviously you don't want to evaluate HTML, but
> the end result should be that I should see exactly what I typed into the
> text box.

Depends if you want to allow formatting... I don't :)

I also haven't had the need to *display* HTML on any of my sites, so
stripping tags is what *I* do.


> If you need to allow formatted text, then use something like BBcode
> where you can specify exactly what is allowed.

Yes.


Justin





RE: [PHP] Cleaning user data

2003-03-19 Thread Chris Shiflett
--- John W. Holmes [EMAIL PROTECTED] wrote:
> I disagree. I think stripping HTML from my text is a horrible thing. If
> I want to put a <b> in my text, then use htmlentities() and show me a
> <b> when I look at it. Obviously you don't want to evaluate HTML, but
> the end result should be that I should see exactly what I typed into the
> text box.

Excellent point.

> If you need to allow formatted text, then use something like BBcode
> where you can specify exactly what is allowed.

Maybe there is something I'm missing, but I have always hated these alternative
markup languages like BBcode that seem to offer no benefit over HTML. If you
want to allow the <b> tag to be evaluated, you can do something like this after
you use htmlentities():

$blah = str_replace('&lt;b&gt;', '<b>', $blah);
$blah = str_replace('&lt;/b&gt;', '</b>', $blah);

Of course, if people want the <b> to appear exactly as they type it, they would
either have to use &lt;b&gt;, or you would have to let them choose an option as
to whether they want to use HTML (much like slash code does).

Chris
