Re: [Dbix-class] Re: [Catalyst] untainting utf8 text for db
Thanks! I've wondered about that. I'd like to see a formal definition of what does get through that filter. Hard to test it on, say, Mandarin Chinese when you don't know the language yourself. But I've a decent database of existing content, so I could see if any of that fails this test. Thanks!

Peter Karman wrote:
[\w]+ works pretty well iirc, if your locale includes UTF-8.

___
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/
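The idea of running existing content through the filter can be sketched roughly as follows. This is only an illustration in Python (the thread itself concerns Perl), and the `ALLOWED` pattern, the function names, and the `existing` sample list are all hypothetical stand-ins rather than anything from the site in question. Instead of a pass/fail per row, it counts which characters fall outside the filter and prints their Unicode general category, which gives an empirical picture of what does and does not get through.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical filter: Unicode word characters plus whitespace.
ALLOWED = re.compile(r"[\w\s]")

def rejected_characters(rows):
    """Count every character in rows that ALLOWED does not match."""
    counts = Counter()
    for text in rows:
        for ch in text:
            if not ALLOWED.fullmatch(ch):
                counts[ch] += 1
    return counts

def describe(counts):
    """Yield (character, Unicode general category, count), most frequent first."""
    for ch, n in counts.most_common():
        yield ch, unicodedata.category(ch), n

# Stand-in for rows pulled from an existing database (assumed data).
existing = ["hello world", "what? (maybe)", "中文 text"]
for ch, cat, n in describe(rejected_characters(existing)):
    print(repr(ch), cat, n)
```

Reviewing the reported characters by category makes it easier to decide which ones (for example `(`, `)` and `?`) should be added to the allowed set.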
Re: [Dbix-class] Re: [Catalyst] untainting utf8 text for db
Of course. But how do you regex an inclusive list for any character in any human language?

On Fri, Jun 6, 2008 at 7:46 PM, Mesdaq, Ali [EMAIL PROTECTED] wrote:
No escape sequence should get through if you reject any characters outside of the allowed characters. For example, you could just reject the input and prompt for another input if this regex matches:

(?:[^a-zA-Z0-9 _]+)

So escape sequences shouldn't affect this test.
--
Daniel McBrearty
email : danielmcbrearty at gmail.com
http://www.engoi.com
http://danmcb.vox.com
http://danmcb.blogger.com
find me on linkedin and facebook
BTW : 0873928131
Re: [Dbix-class] Re: [Catalyst] untainting utf8 text for db
Daniel McBrearty wrote on 6/7/08 5:25 AM:
of course. But how do you regex an inclusive list for any character in any human language ?

[\w]+ works pretty well iirc, if your locale includes UTF-8.

--
Peter Karman . http://peknet.com/ . [EMAIL PROTECTED]
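To make the `\w` suggestion concrete, here is a minimal sketch in Python (used purely for illustration; the thread is about Perl, where the behaviour of `\w` depends on whether the string has been decoded and on any locale settings in effect). In Python 3, patterns applied to `str` are Unicode-aware by default, so `\w` covers letters and digits from many scripts plus the underscore. The function name and the sample strings are assumptions made up for this example.

```python
import re

# \w+ against a Python 3 str: Unicode word characters and underscore only.
WORD_ONLY = re.compile(r"\w+")

def is_word_only(text: str) -> bool:
    """Return True when the whole of text consists of \\w characters."""
    return WORD_ONLY.fullmatch(text) is not None

# Assumed sample inputs: a few scripts, plus two that should be rejected.
samples = ["username_1", "中文", "Привет", "na\u00efve", "drop;table", "a b"]
for s in samples:
    print(repr(s), is_word_only(s))
```

Note that a bare `\w+` rejects spaces and all punctuation, so free text would need those added to the class explicitly. The exact set matched by `\w` differs between regex engines, so it is worth checking against real content rather than assuming.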
RE: [Dbix-class] Re: [Catalyst] untainting utf8 text for db
No escape sequence should get through if you reject any characters outside of the allowed characters. For example, you could just reject the input and prompt for another input if this regex matches:

(?:[^a-zA-Z0-9 _]+)

So escape sequences shouldn't affect this test.

Thanks,
--
Ali Mesdaq (CISSP, GIAC-GREM)
Security Researcher II
Websense Security Labs
http://www.WebsenseSecurityLabs.com
--

-----Original Message-----
From: Daniel McBrearty [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 05, 2008 11:07 PM
To: The elegant MVC web framework
Cc: DBIx::Class user and developer list
Subject: [Dbix-class] Re: [Catalyst] untainting utf8 text for db

Thanks for the suggestions. Indeed, specifying a list of chars which is clean (e.g. [a-zA-Z0-9_] for a username in English) is optimum, and I prefer that. But when you are working with fully multilingual material, this becomes pretty much impossible. As the site in question is all about language learning and could eventually handle any language, that is the issue.

Rejecting some of the suspicious chars you suggest is something I will do - but even that is not foolproof as there are various ways (more than one, IIRC, but I'm not sure what they all are) of using escape sequences to get through. Of the list you suggest, I'd need to keep (, ), ? - all the rest I could kill quite happily.

Again, thanks for the input. I'm going to forward this to the DBIx::Class list (as that is probably where it should have gone in the first place).

___
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/[EMAIL PROTECTED]
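The reject-on-match approach described in this message could look something like the sketch below. It is written in Python only as an illustration (the thread concerns Perl), and the function name `is_acceptable` is an assumption. The pattern is the one quoted in the message. Because characters such as `\` and `%` are themselves outside the allowed set, textual escape sequences are rejected along with everything else.

```python
import re

# Pattern from the message: one or more characters outside the allow-list
# (ASCII letters, digits, space, underscore).
DISALLOWED = re.compile(r"(?:[^a-zA-Z0-9 _]+)")

def is_acceptable(value: str) -> bool:
    """Return False if value contains any character outside the allow-list."""
    return DISALLOWED.search(value) is None

# Assumed sample inputs, including two written-out escape sequences.
for candidate in ["user_name 42", "x'; DROP TABLE users", r"a\x27b", "%27", "中文"]:
    print(repr(candidate), is_acceptable(candidate))
```

The last sample shows the limitation raised elsewhere in the thread: an ASCII-only allow-list also rejects legitimate non-English text.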