[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Just another rather marginal finding; differences between regex and re: regex.findall(r"[\B]", "aBc") ['B'] re.findall(r"[\B]", "aBc") [] (Python 2.7 ... on win32; regex - issue2636-20100912.zip) I believe regex is more correct here, as

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100913.zip is a new version of the regex module. I've removed the ZEROWIDTH flag and added the NEW flag, which turns on the new behaviour such as splitting on zero-width matches and positional flags. If the NEW flag
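For readers unfamiliar with "splitting on zero-width matches", here is a minimal sketch. It does not use the regex module or its NEW flag; it assumes only the standard-library re on Python 3.7 or later, where re.split also splits on empty matches, so it illustrates the idea rather than the module's own behaviour.

```python
import re

# \b matches the empty string at word boundaries, so every match is zero-width.
# Python 3.7+ re.split splits at those positions (assumption: Python >= 3.7).
parts = re.split(r"\b", "Words, words, words.")
print(parts)
```

On older interpreters the same call returns the input unsplit, which is the compatibility concern the flag was meant to address.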

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-11 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100912.zip is a new version of the regex module. More speedups. I've been comparing the speed against Perl wherever possible. In some cases Perl is lightning fast, probably because regex is built into the language and

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-23 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100824.zip is a new version of the regex module. More speedups. Getting towards Perl speed now, depending on the regex. :-) -- Added file: http://bugs.python.org/file18621/issue2636-20100824.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-22 Thread Giampaolo Rodola'
Changes by Giampaolo Rodola' g.rod...@gmail.com: -- nosy: +giampaolo.rodola ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636 ___ ___

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-17 Thread A.M. Kuchling
Changes by A.M. Kuchling li...@amk.ca: -- nosy: -akuchling

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-15 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100816.zip is a new version of the regex module. Unfortunately I came across a bug in the handling of sets. More unit tests added. -- Added file: http://bugs.python.org/file18541/issue2636-20100816.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-14 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100814.zip is a new version of the regex module. I've added default Unicode word boundaries and renamed the Pattern and Match classes. Over to you, Alex. :-) -- Added file:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-14 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 14 August 2010 21:24, Matthew Barnett rep...@bugs.python.org wrote: Over to you, Alex. :-) Et voilà, an exciting Saturday evening http://pypi.python.org/pypi/regex/0.1.20100814 Matthew, I'm currently keeping regex in a private bzr

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-29 Thread Georg Brandl
Georg Brandl ge...@python.org added the comment: Wishlist item: could you give the regex and match classes nicer names, so that they can be referenced as `regex.Pattern` (or `regex.Regex`) and `regex.Match`?

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-26 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Does 'regex' implement default word boundaries (see #7255)?

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-26 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: No. Wouldn't that break compatibility with 're'?

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-26 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs timeho...@users.sourceforge.net added the comment: What about a regex flag? Like regex.W or (?w)?

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-26 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: That's a possibility. I must admit that I don't entirely understand it enough to implement it (the OP said "I don't believe that the algorithm for this is a whole lot more complicated"), and I don't have a need for it myself, but if

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-25 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 25 July 2010 03:46, Matthew Barnett rep...@bugs.python.org wrote: issue2636-20100725.zip is a new version of the regex module. This is now packaged and uploaded to PyPI http://pypi.python.org/pypi/regex/0.1.20100725 --

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-24 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100725.zip is a new version of the regex module. More tweaks for speed.

                        re          regex
bm_regex_compile.py     87.05secs   278.00secs
bm_regex_effbot.py      14.00secs   6.58secs

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-19 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: This has already been reported in issue #3511.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-18 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100719.zip is a new version of the regex module. Just a few more tweaks for speed. -- Added file: http://bugs.python.org/file18054/issue2636-20100719.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-18 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Thanks for the update; Just a small observation regarding some character ranges and ignorecase, probably irrelevant, but a difference to the current re anyway: zero2z =

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-13 Thread Jonathan Halcrow
Jonathan Halcrow jonathan.halc...@gmail.com added the comment: The most recent version on pypi (20100709) seems to be missing _regex_core from py_modules in setup.py. Currently import regex fails, unable to locate _regex_core. -- nosy: +jhalcrow

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-13 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 13 July 2010 22:34, Jonathan Halcrow rep...@bugs.python.org wrote: The most recent version on pypi (20100709) seems to be missing _regex_core from py_modules in setup.py. Sorry, my fault. I've uploaded a corrected version

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-08 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100709.zip is a new version of the regex module. I've moved most of the regex module's Python code into a private module. -- Added file: http://bugs.python.org/file17912/issue2636-20100709.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-07 Thread Mark Summerfield
Mark Summerfield m...@qtrac.eu added the comment: On the PyPI page: http://pypi.python.org/pypi/regex/0.1.20100706.1 in the "Subscripting for groups" bullet it gives this pattern: r"(?<before>.*?)(?<num>\\d+)(?<after>.*)" Shouldn't this be: r"(?P<before>.*?)(?P<num>\\d+)(?P<after>.*)" Or has a new syntax been

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-07 Thread Mark Summerfield
Mark Summerfield m...@qtrac.eu added the comment: If you do: import regex as re; dir(re) you get over 160 items, many of which begin with an underscore and so are private. Couldn't __dir__ be reimplemented to eliminate them? (I know that the current re module's dir() also returns private

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-07 Thread Mark Summerfield
Mark Summerfield m...@qtrac.eu added the comment: I was wrong about r"(?<name>.*)". It is valid in the new engine. And the PyPI docs do say so immediately _following_ the example. I've tried all the examples in "Programming in Python 3" second edition using "import regex as re" and they all worked.
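As a small illustration of the named-group pattern under discussion, here is a sketch using the standard-library re, which accepts only the (?P<name>...) spelling. The sample string is made up for the example, and subscripting a match object by group name assumes Python 3.6 or later.

```python
import re

# Same three named groups as in the PyPI example, in the (?P<name>...) form
# that the standard re module understands.
pattern = re.compile(r"(?P<before>.*?)(?P<num>\d+)(?P<after>.*)")
m = pattern.match("pqr123stu")  # hypothetical sample input

# Match objects can be subscripted by group name (Python 3.6+).
fields = (m["before"], m["num"], m["after"])
print(fields)
```

The lazy `.*?` stops at the first digit, so the digits end up in the `num` group as a whole.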

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-07 Thread Georg Brandl
Georg Brandl ge...@python.org added the comment: Mark, __dir__ as a special method only works when defined on types, so you'd have to use a module subclass for the regex module :) As I already suggested, it is probably best to move most of the private stuff into a separate module, and only
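The "module subclass" idea mentioned here could look roughly like the sketch below. The class name, the module name "demo" and its attributes are all hypothetical; this is not how the regex module is organised, only an illustration of why __dir__ has to live on a type.

```python
import types


class PublicOnlyModule(types.ModuleType):
    """Hypothetical module type whose dir() hides underscore-prefixed names."""

    def __dir__(self):
        # ModuleType.__dir__ lists the module's namespace; drop private names.
        return [name for name in super().__dir__() if not name.startswith("_")]


mod = PublicOnlyModule("demo")       # stand-in for a real module object
mod.compile = lambda pattern: pattern  # a "public" attribute
mod._private_helper = object()         # a "private" attribute

print(dir(mod))
```

Moving private helpers into a separate module, as suggested in the comment, avoids the subclass altogether.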

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Matthew, I'd like to see at least some of these features in 3.2, but ISTM that after more than 2 years this issue is not going anywhere. Is the module still under active development? Is it ready? Is it waiting for reviews and to be added

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: I've packaged Matthew's latest revision and uploaded it to PyPI. This version will build for Python 2 and Python 3; parallel installs will coexist on the same machine.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: I started with trying to modify the existing re module, but I wanted to make too many changes, so in the end I decided to make a clean break and start on a new implementation which was compatible with the existing re module and

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: So, if it's pretty much ready, do you think it could be included already in 3.2?

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Brian Curtin
Brian Curtin cur...@acm.org added the comment: Before anything else is done with it, it should probably be announced in some way. I'm not sure if anyone has opened any of these zip files, reviewed anything, ran anything, or if anyone even knows this whole thing has been going on. --

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Yes, as I said in the previous message it should probably be announced on python-dev and see what the others think. I don't know how much the module has been used in the wild, but since there has been a PyPI package available for a few

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: The file at: http://pypi.python.org/pypi/regex was downloaded 75 times, if that's any help. (Now reset to 0 because of the bug fix.) If it's included in 3.2 then there's the question of whether it should replace the re module

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: If it's backward-compatible with the 're' module, all the tests of the test suite pass and it just improves it and add features I don't see why not. (That's just my personal opinion though, other people might (and probably will)

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs timeho...@users.sourceforge.net added the comment: My only additional opinion is that re is very much used in deployed python applications and was written not just for correctness but also speed. As such, regex should be benchmarked fairly to show that it is commensurately

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Thanks for the prompt fix! It would indeed be nice to see this enhanced re module in the standard library e.g. in 3.2, but I also really appreciate, that also multiple 2.x versions are supported (as my current main usage of this

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Georg Brandl
Georg Brandl ge...@python.org added the comment: FWIW, I'd love seeing the updated regex module in 3.2. Please do bring it up on python-dev. Looking at the latest module on PyPI, I noted that the regex.py file is very long (~3500 lines), even though it is quite compressed (e.g. no blank

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 6 July 2010 18:03, Matthew Barnett rep...@bugs.python.org wrote: The file at http://pypi.python.org/pypi/regex/ was downloaded 75 times, if that's any help. (Now reset to 0 because of the bug fix.) Each release was downloaded between 50

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-06 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: As a crude guide of the speed difference, here's Python 2.6:

                        re          regex
bm_regex_compile.py     86.53secs   260.19secs
bm_regex_effbot.py      13.70secs   8.94secs
bm_regex_v8.py          15.66secs

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-05 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I just noticed a somewhat strange behaviour in matching character sets or alternate matches which contain some more advanced unicode characters, if they are in the search pattern with some simpler ones. The former seem to be ignored

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-07-05 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100706.zip is a new version of the regex module. I've added your examples to the unit tests. The module now passes. Keep up the good work! :-) -- Added file:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-06-19 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- versions: +Python 3.2 -Python 2.7, Python 3.1

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-13 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 13 April 2010 03:21, Matthew Barnett rep...@bugs.python.org wrote: issue2636-20100413.zip is a new version of the regex module. Matthew, When I run test_regex.py 6 tests are failing, with Python 2.6.5 on Ubuntu Lucid and my setup.py.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-13 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: Yes, it passed all the tests, although I've since found a minor bug that isn't covered/caught by them, so I'll need to add a few more tests. Anyway, do: regex.match(ur"\p{Ll}", u"a") regex.match(ur'(?u)\w', u'\xe0') really

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-13 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 13 April 2010 18:10, Matthew Barnett rep...@bugs.python.org wrote: Anyway, do: regex.match(ur"\p{Ll}", u"a") regex.match(ur'(?u)\w', u'\xe0') really return None? Your results suggest that they won't. Python 2.6.5 (r265:79063,

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-13 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100414.zip is a new version of the regex module. I think I might have identified the cause of the problem, although I still haven't been able to reproduce it, so I can't be certain. --

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-13 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: Oops, forgot the file! :-) -- Added file: http://bugs.python.org/file16916/issue2636-20100414.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-13 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 14 April 2010 00:33, Matthew Barnett rep...@bugs.python.org wrote: I think I might have identified the cause of the problem, although I still haven't been able to reproduce it, so I can't be certain. Performed 76 Passed Looks like you

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-04-12 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100413.zip is a new version of the regex module. It includes additional speed-ups. -- Added file: http://bugs.python.org/file16905/issue2636-20100413.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-31 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100331.zip is a new version of the regex module. It includes speed-ups and a minor bugfix. -- Added file: http://bugs.python.org/file16709/issue2636-20100331.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-22 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100323.zip is a new version of the regex module. It now includes a test script. Most of the tests come from the existing test scripts. -- Added file: http://bugs.python.org/file16626/issue2636-20100323.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-16 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: I've adapted the Python 2.6.5 test_re.py as follows,

 from test.test_support import verbose, run_unittest
-import re
-from re import Scanner
+import regex as re
+from regex import Scanner

and run it against regex-2010305. Three tests failed,
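The same "drop-in replacement" idea can be sketched outside a test suite as a guarded import. This is only an illustration, assuming the third-party regex package may or may not be installed; the check at the end uses a feature both modules share.

```python
# Prefer the third-party regex module when it is installed, otherwise
# fall back to the standard library; either way the name "re" is usable.
try:
    import regex as re  # third-party package from PyPI (may be absent)
except ImportError:
    import re  # standard-library fallback

found = re.match(r"\d+", "42 apples").group()
print(found)
```

Code written this way should stick to the API the two modules have in common.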

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-16 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Does regex.py have its own test suite (which also includes tests for all the problems reported in the last few messages)? If so, the new tests could be merged in re's test_re. This will simplify the testing of regex.py and will improve the

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-16 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I am not sure about the testsuite for this regex module, but it seems to me, that many of the problems reported here probably don't apply for the current builtin re, as they are connected with the new features of regex. After the

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-03 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I just noticed a corner case with the newly introduced grapheme matcher \X, if this is used in the character set: regex.findall("\X", "abc") ['a', 'b', 'c'] regex.findall("[\X]", "abc") Traceback (most recent call last): File "<input>", line 1,

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-03 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: \X shouldn't be allowed in a character class because it's equivalent to \P{M}\p{M}*. It's a bug, now fixed in issue2636-20100304.zip. I'm not convinced about the set intersection and difference stuff. Isn't that overdoing it a
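To make the equivalence \X = \P{M}\p{M}* concrete, here is a rough sketch that does the same grouping by hand with the standard-library unicodedata module. The function name is hypothetical and this is not the regex module's implementation; it only shows why \X describes a sequence and so cannot be a member of a character class.

```python
import unicodedata


def clusters(text):
    """Group each non-mark character with the combining marks that follow it."""
    result = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and result:
            result[-1] += ch      # \p{M}*: attach the mark to the current cluster
        elif not is_mark:
            result.append(ch)     # \P{M}: a base character starts a new cluster
        # A mark with no preceding base character is skipped, much as
        # findall would skip it when \P{M} cannot match there.
    return result


sample = "e\u0301a"  # "e" + COMBINING ACUTE ACCENT, then "a"
print(clusters(sample))
```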

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-03-03 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Actually I had that impression too, but I was mainly surprised with these requirements being on the lowest level of the unicode support. Anyway, maybe the relevance of these guidelines for the real libraries is lower, than I

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-26 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 26 February 2010 03:20, Matthew Barnett rep...@bugs.python.org wrote: Added file: http://bugs.python.org/file16375/issue2636-20100226.zip This is now uploaded to PyPI http://pypi.python.org/pypi/regex/0.1.20100226 -- Alex Willmer

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-25 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100226.zip is a new version of the regex module. It now supports the branch reset (?|...|...), enabling the different branches of an alternation to reuse group numbers. -- Added file:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-24 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100224.zip is a new version of the regex module. It includes support for matching based on Unicode scripts as well as on Unicode blocks and properties. -- Added file:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-24 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Thanks, it's indeed a very nice addition to the library... Just a marginal remark; it seems that in script names also some non-BMP characters are covered, however, in the unicode ranges there only BMP.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-22 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Is the issue2636-20100222.zip archive supposed to be complete? I can't find not only the rst or html features, but more importantly the py and pyd files for the particular versions. Anyway, I just skimmed through the

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-22 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: I don't know what happened there. I didn't notice that the zip file was way too small. Here's a replacement (still called issue2636-20100222.zip). Unicode script properties are already included, at least those whose definitions at

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-22 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: OK, you've convinced me, \X is supported. :-) issue2636-20100223.zip is a new version of the regex module. -- Added file: http://bugs.python.org/file16331/issue2636-20100223.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-22 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 22 Feb 2010, at 21:24, Matthew Barnett rep...@bugs.python.org wrote: issue2636-20100222.zip is a new version of the regex module. This new version adds reverse searching. The 'features' now come in ReStructuredText (.rst) and HTML

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-22 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Wow, that's what can be called rapid development :-), thanks very much! I hadn't noticed before that \G had been implemented already. \X works fine for me, it also maintains the input string indices correctly. We can use unicode

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-21 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: On 17 February 2010 19:35, Matthew Barnett rep...@bugs.python.org wrote: The main text at http://pypi.python.org/pypi/regex appears to have lost its backslashes, for example:    The Unicode escapes u and U are supported.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-18 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Thanks for fixing the argument positions; unfortunately, it seems there might be some other problem that makes my code work differently than the builtin re; it seems that in the character classes the ignorecase flag is ignored somehow:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-18 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100219.zip is a new version of the regex module. The regex module should give the same results as the re module for backwards compatibility. The ignorecase bug is now fixed. This new version releases the GIL when
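As a reference point for the backwards-compatibility claim, this is what the standard-library re does with IGNORECASE inside a character class. The pattern and input are made up for illustration and are not the ones from the original report.

```python
import re

# With IGNORECASE, the range a-c also matches A-C inside a character class.
hits = re.findall(r"[a-c]", "aBcD", re.IGNORECASE)
print(hits)
```

A compatible engine is expected to return the same list for the same call.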

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-17 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: I've packaged this latest revision and uploaded to PyPI http://pypi.python.org/pypi/regex

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-17 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: The main text at http://pypi.python.org/pypi/regex appears to have lost its backslashes, for example: The Unicode escapes u and U are supported. instead of: The Unicode escapes \u and \U are

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-17 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I just tested the fix for unicode tracebacks and found some possibly weird results (not sure how/whether it should be fixed, as these inputs are indeed rather artificial...). (win XPp SP3 Czech, Python 2.6.4) Using the cmd console,

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-17 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100218.zip is a new version of the regex module. I've added '.' to the permitted characters when parsing the name of a property. The name itself is no longer reported in the error message. I've also corrected the

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-10 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Thanks for the quick update, I confirm the fix for both issues; just another finding (while testing the behaviour mentioned previously - msg91917) The property name normalisation seems to be much more robust now, I just encountered an

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-10 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: I've been aware for some time that exception messages in Python 2 can't be Unicode, but I wasn't sure which encoding to use, so I've decided to use that of sys.stdout. It appears to work OK in IDLE and at the Python prompt.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-09 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I'd like to add another issue I encountered with the latest version of regex - issue2636-20100204.zip It seems, that there is an error in handling some quantifiers in python 2.5 on Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-09 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100210.zip is a new version of the regex module. The reported bugs appear to be fixed now. -- Added file: http://bugs.python.org/file16195/issue2636-20100210.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-08 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Hi, thanks for the update! Just for the unlikely case it hasn't been noticed so far, using python 2.6.4 or 2.5.4 with the regexp build issue2636-20100204.zip I am getting the following easy-to-fix error: Python 2.6.4 (r264:75708, Oct

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-02-03 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100204.zip is a new version of the regex module. I've added splititer and added a build for Python 3.1. -- versions: +Python 3.1 Added file: http://bugs.python.org/file16122/issue2636-20100204.zip
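The standard-library re has no splititer, so the idea of a lazy split can be sketched as a generator built on re.finditer. The helper below is hypothetical and simplified: it ignores capture groups and maxsplit, and it is not the regex module's implementation.

```python
import re


def splititer(pattern, string):
    """Yield the pieces of `string` between matches of `pattern`, lazily.

    Simplified sketch: capture groups are not yielded and there is no
    maxsplit argument, unlike re.split.
    """
    pos = 0
    for m in re.finditer(pattern, string):
        yield string[pos:m.start()]
        pos = m.end()
    yield string[pos:]


pieces = list(splititer(r",\s*", "a, b,c"))
print(pieces)
```

The benefit of an iterator is that a large input does not have to be split into one big list up front.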

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-01-15 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20100116.zip is a new version of the regex module. I've given up on the breadth-wise matching - it was too difficult finding a pattern structure that would work well for both depth-first and breadth-wise. It probably

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-12-31 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- priority: - normal

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-24 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I'd like to add some detail to the previous msg91473 The current behaviour of the character properties looks a bit surprising sometimes: regex.findall(ur"\p{UppercaseLetter}", u"QW\p{UppercaseLetter}as") [u'Q', u'W', u'U', u'L']
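For comparison, the set of characters that \p{UppercaseLetter} is expected to select can be computed with the standard-library unicodedata module. This sketch assumes the property corresponds to the Unicode general category Lu and uses the same input string as the report.

```python
import unicodedata

text = r"QW\p{UppercaseLetter}as"  # same subject string as in the report

# Keep every character whose general category is Lu (Letter, uppercase).
uppercase = [ch for ch in text if unicodedata.category(ch) == "Lu"]
print(uppercase)
```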

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-17 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: Matthew's 20080915.zip attachment is now on PyPI. This one, having a more complete MANIFEST, will build for people other than me.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-15 Thread Mark Summerfield
Mark Summerfield m...@qtrac.eu added the comment: Hi, I've noticed 3 differences between the re and regex engines. I don't know if they are intended or not, but thought it best to mention them. (I used the issue2636-20090810#3.zip version.) Python 2.6.2 (r262:71600, Apr 20 2009, 09:25:38)

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-15 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Simplification of Mark's first two problems: Problem 1: looks like regex's negative look-ahead assertion is broken re.findall(r'(?!a)\w', 'abracadabra') ['b', 'r', 'c', 'd', 'b', 'r'] regex.findall(r'(?!a)\w', 'abracadabra') []
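The reference behaviour quoted for re in Problem 1 can be reproduced with the standard library alone; a small sketch, written for Python 3:

```python
import re

# (?!a) is a negative look-ahead: \w may match only where the next
# character is not "a", so every "a" in the subject is skipped.
letters = re.findall(r"(?!a)\w", "abracadabra")
print(letters)
```

A check like this is the kind of regression test that could go into the shared test suite.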

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-15 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20090815.zip fixes the bugs found in msg91598 and msg91607. The regex engine currently lacks some of the optimisations that the re engine has, but I've concluded that even with them the extra work that the engine needs to

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-13 Thread Alex Willmer
Alex Willmer a...@moreati.org.uk added the comment: I've made an installable package of Matthew Barnett's patch. It may get this to a wider audience. http://pypi.python.org/pypi/regex Next I'll look at incorporating Andrew Kuchling's suggestion of the re tests from CPython. --

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-12 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs timeho...@users.sourceforge.net added the comment: /lurk Re: timings Thanks for the info, John. First of all, I really like those tests and could you please submit a patch or other document so that we could combine them into the python test suite. The python test suite,

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-12 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Remember, though, that when run as a single instance, at least in the existing engine, the re compiler caches recent compiles, so repeatedly compiling an expression flattens the overhead in a single run to a single compile and lookup, where
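The compile cache mentioned here can be observed directly. The sketch below relies on a CPython implementation detail (re.compile returning the cached pattern object for a repeated pattern and flags), so treat it as an illustration for benchmarking purposes rather than documented behaviour.

```python
import re

re.purge()  # start from an empty cache

first = re.compile(r"\d+-\w+")
second = re.compile(r"\d+-\w+")  # same pattern and flags: served from the cache

# In CPython the cached object itself is returned (implementation detail).
same_object = first is second
print(same_object)

re.purge()  # benchmarks that want to time compilation need to clear the cache
third = re.compile(r"\d+-\w+")
print(third.pattern == first.pattern)
```

This is why a loop that recompiles the same expression mostly measures a dictionary lookup, not the compiler.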

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-12 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs timeho...@users.sourceforge.net added the comment: Mea culpa et mes apologies, The '-s' options to John's expressions are indeed executed only once -- they are one-time setup lines. The final quoted expression is what's run multiple times. In other words, improving caching in

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-12 Thread Walter Dörwald
Changes by Walter Dörwald wal...@livinglogic.de: -- nosy: -doerwalter ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636 ___ ___ Python-bugs-list

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-12 Thread Collin Winter
Collin Winter coll...@gmail.com added the comment: FYI, Unladen Swallow includes several regex benchmark suites: a port of V8's regex benchmarks (regex_v8); some of the regexes used when tuning the existing sre engine 7-8 years ago (regex_effbot); and a regex_compile benchmark that tests

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-11 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: Sorry for the dumb question, which may also suggest that I'm unfortunately unable to contribute at this level (with zero knowledge of C and only a working one of Python): Where can I find the sources for tests etc. and how they are

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-11 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Take a look at the dev FAQ, linked from http://www.python.org/dev. The tests are in Lib/test in a distribution installed from source, but ideally you would be (anonymously) pulling the trunk from SVN (when it is back) and creating your

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-11 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: What is the expected timing comparison with re? Running the Aug10#3 version on Win XP SP3 with Python 2.6.3, I see regex typically running at only 20% to 50% of the speed of re in ASCII mode, with not-very-atypical tests (find all

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-10 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: First, many thanks for this contribution; it's great that the re module is being updated in such a comprehensive way! I'd like to report an issue with the current version (issue2636-20090804.zip). Using an empty string as the search

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-10 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Adding to vbr's report: [2.6.2, Win XP SP3] (1) bug mallocs memory inside loop (2) also happens to regex.findall with patterns 'a{0,0}' and '\B' (3) regex.sub('', 'x', 'abcde') has similar problem BUT 'a{0,0}' and '\B' appear to work
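As a baseline for these reports, this is how the standard re module handles patterns that can only match the empty string. It is a minimal sketch using stdlib re only; it shows the terminating behaviour an engine should have and does not reproduce the bug in the regex patch. The variable names are illustrative.

```python
import re

text = 'abcde'

# An empty pattern matches once at each of the 6 positions (0 through 5).
empty_matches = re.findall('', text)

# \B matches only between two word characters: positions 1 through 4.
non_boundaries = re.findall(r'\B', text)

# Substitution inserts the replacement at every empty match.
substituted = re.sub('', 'x', text)

print(len(empty_matches), len(non_boundaries), substituted)
# 6 4 xaxbxcxdxex
```

The key point is that the engine must advance by one position after each zero-width match; otherwise it loops at the same position, which is consistent with the memory growth described above.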

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-10 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20090810.zip should fix the empty-string bug. -- Added file: http://bugs.python.org/file14682/issue2636-20090810.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-10 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: issue2636-20090810#2.zip has some further improvements and bugfixes. -- Added file: http://bugs.python.org/file14683/issue2636-20090810#2.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2009-08-10 Thread Vlastimil Brom
Vlastimil Brom vlastimil.b...@gmail.com added the comment: I'd like to confirm that the error reported above is fixed in issue2636-20090810#2.zip. While testing the new features a bit, I noticed some irregularity in the handling of Unicode character properties; I tried randomly some of those
