Hey Zladivliba,

> I'd like to use a SecureText function that filters any "potential risky" text
> input from my website.

This "function" depends on which risk you wish to prevent.

> So what's "potential risky text" for me: everything that's not a-zA-Z0-9
> ;.,?+-_![]/()
> This clears any hack potential from XSS or SQL injection. This limits the
> usability of my website, but I don't care: as long as XSS is not possible
> any more and SQL injections are out of the way, it's perfectly viable for
> what I want to do and the needs of my users.

If you want to create something that would be an escape sequence for both HTML
and SQL, you'll have a really difficult time in attempting a feat like this. The
best practices to avoid any XSS, XSRF, SQL Injection, or any other kind of
exploit would be to *VALIDATE* your input as you receive it from the user, and
*ESCAPE* all output to any stream.

What I mean by validating input: Make sure the data you are receiving from the
user is the type of data you would expect. For example, if you are expecting a
date pattern, then validate against a date pattern. If you are expecting an
alpha-numeric string, then validate against this.
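
As a rough sketch of what "validate against a pattern" can look like in plain
PHP: the two helper names below are made up for illustration, and the date
format (YYYY-MM-DD) is an assumption, so adjust both to whatever your form
actually expects.

```php
<?php
// Hypothetical helpers for illustration; the names are not from any library.

/** Returns true only for a real calendar date written as YYYY-MM-DD. */
function isValidDate(string $value): bool
{
    $date = DateTime::createFromFormat('Y-m-d', $value);
    // createFromFormat() rolls a date like 2011-02-30 over into March, so
    // compare the round-tripped string with the input to reject such values.
    return $date !== false && $date->format('Y-m-d') === $value;
}

/** Returns true only for a non-empty string of ASCII letters and digits. */
function isAlphaNumeric(string $value): bool
{
    // \A and \z anchor the whole string; unlike $, \z does not tolerate a
    // trailing newline.
    return preg_match('/\A[a-zA-Z0-9]+\z/', $value) === 1;
}
```

Anything that fails the check gets rejected outright rather than "cleaned up",
which keeps the rest of the application working with data of a known shape.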

What I mean by escaping all output is to ensure data is escaped properly
according to the rules of the stream to which you are writing. Writing up
something that is supposed to escape output as it goes to SQL, XML/HTML, CSS
and/or JS would be extremely tedious, not to mention you would probably be
limited to a set of word characters, with nothing special in between. This is
due to the numerous rules that are in place that define the characters which are
special to each language.

A good practice would be to escape your data to the appropriate stream. For
example, if you are writing up a MySQL query, use prepared statements and bind
your input to variables to ensure the data is properly escaped before performing
the query. PDO <http://php.net/pdo> and MySQLi <http://php.net/mysqli> are very
good at this. Beyond that, Zend_Db and its database adapters do well at binding
parameters to prepared statements, which abstracts you from the database enough
that you do not have to worry about this. If you must send a raw query to the
database, http://php.net/mysql_real_escape_string will be your best friend in
this instance (or http://php.net/mysqli_real_escape_string with MySQLi; the old
mysql extension was removed in PHP 7).
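
Here is a minimal sketch of the prepared-statement approach with PDO. To keep
it runnable without a server I am assuming an in-memory SQLite database in
place of MySQL; the "users" table and the findUsersByName() helper are
hypothetical.

```php
<?php
// Assumption: SQLite in memory stands in for MySQL; the schema is made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");
$pdo->exec("INSERT INTO users (name) VALUES ('bob')");

function findUsersByName(PDO $pdo, string $name): array
{
    // The value is bound to the :name placeholder and travels separately
    // from the SQL text, so quotes in $name cannot alter the query.
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// A classic injection attempt is treated as an (unmatched) literal name.
$rows = findUsersByName($pdo, "alice' OR '1'='1");
```

The same pattern applies with a MySQL DSN; only the connection line changes.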

When spitting data out to the browser, use $this->escape($data) in your views
and partials to ensure the data is escaped according to the rules of HTML. By
default $this->escape() uses http://php.net/htmlspecialchars, which can be
changed quite easily with setEscape() on the view, should you need to escape
for XML data.
As for JavaScript, use Zend_Json::encode() and Zend_Json::decode() when passing
data between the server and client-side scripts.
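
Outside of a ZF view, the same idea can be sketched with PHP's built-in
functions. This is only an illustration: htmlspecialchars() and json_encode()
stand in for $this->escape() and Zend_Json::encode(), and the sample comment
string is made up.

```php
<?php
$comment = '<script>alert("xss")</script>';

// HTML context: turn markup characters into entities. ENT_QUOTES also
// covers single quotes, which matters inside attribute values.
$html = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// JavaScript context: encode as JSON. The JSON_HEX_* flags write <, >, &,
// ' and " inside the value as \uXXXX escapes, so the value cannot close a
// surrounding <script> block.
$js = json_encode(
    $comment,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);
```

Note that the two results differ: each one is only safe in the context it was
escaped for, which is why a single "escape everything" function falls short.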

A great blog I follow is written by a security officer, Mike Baily:
http://blog.skeptikal.org/
A lot of what I know and am preaching comes from him (not necessarily
ZF-related, but general security advice).

> Now the question is how to build such a function. It seems that I can't use
> preg_replace because of unicode characters. I've tried to use filter_var but
> I'm not sure it also filters unicode characters. And I really want to strip
> everything that's risky; it's my main priority.
> Cool guys on #Zftalk have advised me to use the pregReplace filter built into
> ZF since I have a regexp, but I'm not sure regexp is secure so...
> Any help appreciated! I'm a little lost with this.

Building a single function for this is not practical, as stated above. It
takes a collection of functions to do this properly. I once created a
function long ago that was supposed to do this magic "filter and escape all."
However, I have since learned that such a function is not good practice because
you then end up with the wrong type of escape sequences for different types of
data.
http://pastie.org/941165
You may use this function, if you wish, however I advise against it for the
reasons stated in the above paragraph.

Using RegExes to filter your data can be helpful, but I don't recommend it as a
"one size fits all" solution. There are still ways of bypassing RegEx
filters. For example, if you allow the "e" modifier in your regular expressions
and use user input for the replace pattern, that's a recipe for disaster.
http://www.murraypicton.com/2010/11/using-phps-preg_replace-with-the-e-modifier/
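
If you need computed replacements, preg_replace_callback() gives you the same
result without evaluating any strings as code (the "e" modifier was deprecated
in PHP 5.5 and removed in PHP 7.0). A small sketch, with a made-up [b]...[/b]
tag format and a hypothetical helper name:

```php
<?php
// Hypothetical example: upper-case every run of text wrapped in [b]...[/b].
function shoutBoldWords(string $text): string
{
    return preg_replace_callback(
        '/\[b\](.*?)\[\/b\]/s',
        // The match arrives as plain data in $m; nothing taken from the
        // input is ever evaluated as PHP code.
        function (array $m): string {
            return strtoupper($m[1]);
        },
        $text
    );
}
```

Input that tries to smuggle in PHP syntax simply comes back as upper-cased
text, because the callback only ever sees it as a string.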

That's not to say that RegExes are bad, but it should serve as a word of
caution against the 'e' modifier. RegExes are good for validating against
specific sets of data, such as phone numbers, eMail addresses, and other
specific types of strings.
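
On the Unicode worry from the original question: preg_replace() copes with
UTF-8 if you add the "u" modifier. The sketch below strips everything outside
the whitelist quoted earlier; the helper name is hypothetical, and including
the space character in the whitelist is my assumption. It narrows the input,
but it is not a substitute for escaping on output.

```php
<?php
// Hypothetical helper: drop every character that is not in the whitelist
// a-zA-Z0-9 ;.,?+-_![]/() (space included as an assumption).
function stripToWhitelist(string $input): string
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so a multi-byte
    // character is removed as one unit. preg_replace() returns null when the
    // input is not valid UTF-8; treat that as "nothing survives".
    $clean = preg_replace('/[^a-zA-Z0-9 ;.,?+\-_!\[\]\/()]/u', '', $input);
    return $clean ?? '';
}
```

For example, stripToWhitelist('Hello <b>wörld</b>!') leaves 'Hello bwrld/b!',
which shows both the effect and the usability cost of such a strict filter.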


One last bit of advice: If you are looking for something that will be 100%
secure, you will be searching for a very, very long time. The philosophy I've
learned in the realm of Information Security is that you can NEVER have anything
that is absolutely secure in every single manner. It's much easier to try to
hack/break something than it is to protect against exploits. There is no such
thing as something that is totally secure, as someone will always attempt to
brute-force their way into the application until they've accomplished their goal
(given they have the time :P). You can only implement ways of protecting against
that threat in the hope that you will drive them away.

Additionally, I second everybody's statement that you should not try to
re-invent the wheel :)

Hope this helps,
-Kizano
//-----
Information Security
eMail: [email protected]
http://www.markizano.net/
