Re: [Dbix-class] Re: [Catalyst] untainting utf8 text for db

2008-06-08 Thread Daniel McBrearty
Thanks! I've wondered about that. I'd like to see a formal definition
of what does get through that filter. It's hard to test it on, say,
Mandarin Chinese when you don't know the language yourself. But I have a
decent database of existing content, so I could see whether any of it
fails this test.

Thanks!


> [\w]+
>
> works pretty well iirc, if your locale includes UTF-8.
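
One way to approach the "what gets through" question is to run the existing content through the filter and list the characters that fail, grouped by Unicode general category. This is only a sketch, in Python rather than Perl: `failing_characters` and the sample rows are hypothetical stand-ins for the real database content, and Python's `\w` is not guaranteed to match exactly the same set as Perl's.

```python
import re
import unicodedata
from collections import Counter

# A single "word" character; Unicode-aware because the pattern is a str.
WORD_CHAR = re.compile(r"\w")

def failing_characters(rows):
    """Count characters in rows that \\w does not match, keyed by (char, category)."""
    counts = Counter()
    for row in rows:
        for ch in row:
            if not WORD_CHAR.match(ch):
                counts[(ch, unicodedata.category(ch))] += 1
    return counts

# Hypothetical sample rows standing in for existing database content.
rows = ["你好吗?", "¿Cómo estás?", "good morning"]
for (ch, category), n in failing_characters(rows).most_common():
    print(f"U+{ord(ch):04X} {category} {unicodedata.name(ch, '?')} x{n}")
```

Spaces and punctuation fail a bare `\w` test, so free text would need an allow-list somewhat wider than `[\w]+`, for instance one that also admits the `(`, `)` and `?` mentioned elsewhere in the thread.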


___
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/


Re: [Dbix-class] Re: [Catalyst] untainting utf8 text for db

2008-06-07 Thread Daniel McBrearty
Of course. But how do you regex an inclusive list for any character in
any human language?


On Fri, Jun 6, 2008 at 7:46 PM, Mesdaq, Ali [EMAIL PROTECTED] wrote:
> No escape sequence should get through if you reject any characters
> outside of the allowed characters. For example you could just reject the
> input and prompt for another input if this regex matches
> (?:[^a-zA-Z0-9 _]+)
> So escape sequences shouldn't affect this test.

-- 
Daniel McBrearty
email : danielmcbrearty at gmail.com
http://www.engoi.com
http://danmcb.vox.com
http://danmcb.blogger.com
find me on linkedin and facebook
BTW : 0873928131


Re: [Dbix-class] Re: [Catalyst] untainting utf8 text for db

2008-06-07 Thread Peter Karman
Daniel McBrearty wrote on 6/7/08 5:25 AM:
> Of course. But how do you regex an inclusive list for any character in
> any human language?

[\w]+

works pretty well iirc, if your locale includes UTF-8.
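
A rough illustration of a Unicode-aware `\w` check, in Python rather than Perl; `is_word_only` and the sample strings are hypothetical. Python's `re` treats `\w` as Unicode-aware for decoded `str` input and ASCII-only for `bytes`; whether Perl behaves the same way depends on how the input was decoded and on the locale point mentioned above.

```python
import re

WORD_ONLY = re.compile(r"\w+")          # Unicode-aware for str patterns
WORD_ONLY_BYTES = re.compile(rb"\w+")   # ASCII-only for bytes patterns

def is_word_only(text: str) -> bool:
    """True when the whole string consists of word characters."""
    return WORD_ONLY.fullmatch(text) is not None

print(is_word_only("你好"))        # decoded Mandarin text
print(is_word_only("Привет"))     # decoded Cyrillic text
print(is_word_only("¿Qué?"))      # punctuation is not a word character
print(is_word_only("two words"))  # neither is a space
# The same Mandarin text as raw UTF-8 bytes, i.e. not decoded first.
print(WORD_ONLY_BYTES.fullmatch("你好".encode("utf-8")) is not None)
```

The last line is the case to watch for: the same text, left as undecoded bytes, no longer passes.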

--
Peter Karman  .  http://peknet.com/  .  [EMAIL PROTECTED]


RE: [Dbix-class] Re: [Catalyst] untainting utf8 text for db

2008-06-06 Thread Mesdaq, Ali
No escape sequence should get through if you reject any characters
outside of the allowed characters. For example you could just reject the
input and prompt for another input if this regex matches
(?:[^a-zA-Z0-9 _]+)
So escape sequences shouldn't affect this test.
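
A minimal sketch of the reject-on-match check described above, written in Python rather than Perl purely for illustration; the function name `is_clean` and the sample inputs are hypothetical. It assumes the input has already been decoded to text.

```python
import re

# Matches any run of characters outside the allow-list
# (ASCII letters, digits, space, underscore).
DISALLOWED = re.compile(r"[^a-zA-Z0-9 _]+")

def is_clean(text: str) -> bool:
    """Return True when no character outside the allow-list is present."""
    return DISALLOWED.search(text) is None

samples = ["daniel_mcb 42", "robert'); DROP TABLE users;--", "\\x27 OR 1=1", "你好"]
for sample in samples:
    print(repr(sample), "accepted" if is_clean(sample) else "rejected")
```

A literal escape sequence such as `\x27` is rejected because the backslash itself is outside the list. Every non-ASCII letter is rejected as well, which is the multilingual limitation raised in the quoted message below.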

Thanks,
--
Ali Mesdaq (CISSP, GIAC-GREM)
Security Researcher II
Websense Security Labs
http://www.WebsenseSecurityLabs.com
--

-Original Message-
From: Daniel McBrearty [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 05, 2008 11:07 PM
To: The elegant MVC web framework
Cc: DBIx::Class user and developer list
Subject: [Dbix-class] Re: [Catalyst] untainting utf8 text for db

Thanks for the suggestions. Indeed, specifying a list of chars which is
clean (e.g. [a-zA-Z0-9_] for a username in English) is optimum, and I
prefer that. But when you are working with fully multilingual material,
this becomes pretty much impossible. As the site in question is all
about language learning and could eventually handle any language, that
is the issue.

Rejecting some of the suspicious chars you suggest is something I will
do - but even that is not foolproof as there are various ways (more than
one, IIRC, but I'm not sure what they all are) of using escape sequences
to get through.

Of the list you suggest, I'd need to keep (, ), ? - all the rest I could
kill quite happily.

Again, thanks for the input. I'm going to forward this to the
DBIx::Class list (as that is probably where it should have gone in the
first place).

___
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive:
http://www.grokbase.com/group/[EMAIL PROTECTED]


 

