Python sqlite and regex.
Hi, I'd like to use regular expressions in sqlite query, I using apsw module but it doesn't work...Can you help me ? My script: import apsw import re path = 'db/db.db3' #regexp function (extract from python-list discusion) def regexp(expr, item): reg = re.compile(expr) return reg.match(item) is not None con = apsw.Connection(path) #create function con.createscalarfunction(REGEXP, regexp) cur = con.cursor() #exampl cur.execute(select foo from test where foo regex 'aa.[0-9])) and the error is: cur.execute('select foo from test where foo regex tata') apsw.SQLError: SQLError: near regex: syntax error Thanks -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
On Fri, 19 May 2006 14:47:10 +0200, Julien ARNOUX [EMAIL PROTECTED] wrote: cur.execute(select foo from test where foo regex 'aa.[0-9])) and the error is: cur.execute('select foo from test where foo regex tata') apsw.SQLError: SQLError: near regex: syntax error I think you're missing a closing quote on that regex; or perhaps that's an extra closing parenthesis at the end. Also, it's probably best to let the database module do any escaping you may need. For example: fooregex = r'aa.[0-9]' sql = 'select foo from test where foo regex %s' cur.execute( sql, tuple( fooregex ) ) See the DP API spec for more information. Regards, Dan -- Dan Sommers http://www.tombstonezero.net/dan/ I wish people would die in alphabetical order. -- My wife, the genealogist -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
SQLite3 already has a REGEXP function, so you don't need to create your own. As Dan mentioned you also have a problem in your expression: 'aa.[0-9]) You need a closing quote on the expression, and you need to match the close paren with an open paren, or remove it. Also, in case you weren't aware, there will be a sqlite3 module in Python 2.5 based on pysqlite 2.2: http://initd.org/tracker/pysqlite Using pysqlite will make it easier to move to the Python 2.5 sqlite3 module if that's important to you. -- Matt Good -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Matt Good wrote: SQLite3 already has a REGEXP function, so you don't need to create your own. [...] Yes, but SQLite does not include a regular expression engine, and thus according to the SQLite docs you need to register a REGEXP function in order to make the REGEXP operator work: The REGEXP operator is a special syntax for the regexp() user function. No regexp() user function is defined by default and so use of the REGEXP operator will normally result in an error message. If a user-defined function named regexp is defined at run-time, that function will be called in order to implement the REGEXP operator. - -- Gerhard -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.1 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFEbfbQdIO4ozGCH14RAqqqAJ49/9Kpi8xA6AyGB0tVJ/JcU4MczgCgoIsW gdYgUl9ge63CiHqj4Mzgpns= =1ZDV -END PGP SIGNATURE- -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
Gerhard Häring [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Matt Good wrote: SQLite3 already has a REGEXP function, so you don't need to create your own. [...] Yes, but SQLite does not include a regular expression engine, and thus according to the SQLite docs you need to register a REGEXP function in order to make the REGEXP operator work: The REGEXP operator is a special syntax for the regexp() user function. No regexp() user function is defined by default and so use of the REGEXP operator will normally result in an error message. If a user-defined function named regexp is defined at run-time, that function will be called in order to implement the REGEXP operator. This is very interesting. So I *could* define my own regexp function that processes not regular expressions, but say, glob-like strings, which are usually much easier for end users to work with (very basic wild-carding where '*' matches one or more characters, and '?' matches any single character - maybe add '#' to match any single digit and '@' to match any single alpha character). -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
Paul McGuire [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] where '*' matches one or more characters, and '?' matches any single oops, I meant '*' matches zero or more characters. In many applications, these tests are sufficient for most user queries. And this eliminates the problem of teaching application users how to create queries using the full regular expression syntax. -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
Paul McGuire wrote: Paul McGuire [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] where '*' matches one or more characters, and '?' matches any single oops, I meant '*' matches zero or more characters. '?' also matches 0 characters -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
John Salerno [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Paul McGuire wrote: Paul McGuire [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] where '*' matches one or more characters, and '?' matches any single oops, I meant '*' matches zero or more characters. '?' also matches 0 characters Maybe it does, and maybe it doesn't... :) Here is my test with the glob module (in my directory, there is a file named PUTTY.RND): glob.glob('PUTTY.RND') ['PUTTY.RND'] glob.glob('PUTTY.RN?') ['PUTTY.RND'] glob.glob('PUTTY.RN?D') [] glob.glob('PUTTY.RN?') ['PUTTY.RND'] glob.glob('PUTTY.RND?') [] glob.glob('PUTT?.RND') ['PUTTY.RND'] glob.glob('PUTTY?.RND') [] glob.glob('PUTT?Y.RND') [] Looks like '?' does *not* match zero characters in glob. On the other hand, in the Windows console window, it appears that '?' *does* match zero characters. Of course, you could write your regexp() routine to interpret '?' any way you wanted. -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
On Fri, 19 May 2006 17:44:45 GMT, Paul McGuire [EMAIL PROTECTED] wrote: Gerhard Häring [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] The REGEXP operator is a special syntax for the regexp() user function. No regexp() user function is defined by default and so use of the REGEXP operator will normally result in an error message. If a user-defined function named regexp is defined at run-time, that function will be called in order to implement the REGEXP operator. This is very interesting. So I *could* define my own regexp function that processes not regular expressions, but say, glob-like strings, which are usually much easier for end users to work with (very basic wild-carding where '*' matches one or more characters, and '?' matches any single character - maybe add '#' to match any single digit and '@' to match any single alpha character). Doesn't SQL already have lightweight wildcards? SELECT somefield FROM sometable WHERE someotherfield LIKE '%foo%' Regards, Dan -- Dan Sommers http://www.tombstonezero.net/dan/ I wish people would die in alphabetical order. -- My wife, the genealogist -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
Dan Sommers [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Fri, 19 May 2006 17:44:45 GMT, Paul McGuire [EMAIL PROTECTED] wrote: Gerhard Häring [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] The REGEXP operator is a special syntax for the regexp() user function. No regexp() user function is defined by default and so use of the REGEXP operator will normally result in an error message. If a user-defined function named regexp is defined at run-time, that function will be called in order to implement the REGEXP operator. This is very interesting. So I *could* define my own regexp function that processes not regular expressions, but say, glob-like strings, which are usually much easier for end users to work with (very basic wild-carding where '*' matches one or more characters, and '?' matches any single character - maybe add '#' to match any single digit and '@' to match any single alpha character). Doesn't SQL already have lightweight wildcards? SELECT somefield FROM sometable WHERE someotherfield LIKE '%foo%' Yes it does - '%' is like '*', and '_' is like '?'. But it is rare for application users to be familiar with these. In my experience, users are more likely to have seen '*' and '?'. The reason I keep citing end/application users, as opposed to developers, is for the case where the user has filled in some kind of wildcard search field, to be passed to SQL in a query. Woe to the app designer who figures that users want to enter a regular expression in such a field. '%' and '_' wildcards a little better, but frankly, I think '*' and '?' looks more like a wildcards than '%' and '_'. -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
Oops, sorry about the confusion regarding the built-in REGEXP. That's kind of disappointing. It would appear that the user-defined regexp function in the original post should work assuming the SQL and regex syntax errors are corrected. However, there *is* a GLOB built-in to SQLite 3 that has a default registered implementation (though it can be overriden by a user-defined glob method if necessary). -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
On Fri, 19 May 2006 18:52:38 GMT, Paul McGuire [EMAIL PROTECTED] wrote: Dan Sommers [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Doesn't SQL already have lightweight wildcards? SELECT somefield FROM sometable WHERE someotherfield LIKE '%foo%' Yes it does - '%' is like '*', and '_' is like '?'. But it is rare for application users to be familiar with these. In my experience, users are more likely to have seen '*' and '?'. The reason I keep citing end/application users, as opposed to developers ... Users, application users, end users, etc., vary over time. Since this is c.l.p., though, we'd better just agree to disagree over which user(s) may or may not be familiar with which (or any) wildcard conventions. ... the case where the user has filled in some kind of wildcard search field, to be passed to SQL in a query. Woe to the app designer who figures that users want to enter a regular expression in such a field ... Woe to the app designer who fails to target the app to the users, make it clear to the users what is expected in any given field, and accept constructive criticism from unhappy users. ;-) Knowing your users is one of the most important factors in software design. ... '%' and '_' wildcards a little better, but frankly, I think '*' and '?' looks more like a wildcards than '%' and '_'. Ah, but then you're not an end user, are you? ;-) Regards, Dan -- Dan Sommers http://www.tombstonezero.net/dan/ I wish people would die in alphabetical order. -- My wife, the genealogist -- http://mail.python.org/mailman/listinfo/python-list
Re: Python sqlite and regex.
John Salerno [EMAIL PROTECTED] writes: Paul McGuire wrote: Paul McGuire [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] where '*' matches one or more characters, and '?' matches any single oops, I meant '*' matches zero or more characters. '?' also matches 0 characters Not in globs. In a glob, '?' matches any one character, '*' matches any zero or more characters. In a regex, '.' matches any one character, '?' matches the preceding atom zero or one times, '*' matches the preceding atom zero or more times, and '+' matches the preceding atom one or more times. They're quite different syntaxes, but confusingly similar in appearance. On most GNU+Linux systems, these two commands get the relevant manual pages: $ man 7 glob $ man 7 regex -- \ When I get real bored, I like to drive downtown and get a | `\ great parking spot, then sit in my car and count how many | _o__) people ask me if I'm leaving. -- Steven Wright | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list