Jeffrey C. Jacobs added the comment:
If I recall, I started this thread with a plan to update re itself with
implementations of various features listed in the top post. If you look at the
list of files uploaded by me there are some complete patches for Re to add
various features like Atomic
Jeffrey C. Jacobs added the comment:
Thanks Matthew, and sorry to put you through more work; I just wanted to verify
exactly which Unicode code points (UTF-16, I take it) were being used, to check whether the
Unicode standard expects them to be treated as unique words or single letters
within a word. Sanskrit
Jeffrey C. Jacobs added the comment:
Maybe you could show us the byte-for-byte hex of the string you're testing so
we can examine if it's really a code point intending word boundary or just a
code point for the sake of beginning a new
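A dump of the kind requested here could be produced along the following lines. This is only a sketch: the helper name `dump_code_points` is hypothetical, and the sample string is an illustration, not the string from the original report.

```python
import unicodedata

def dump_code_points(text):
    """Return one (code point, UTF-16 hex, category, name) tuple per character."""
    rows = []
    for ch in text:
        rows.append((
            f"U+{ord(ch):04X}",
            ch.encode("utf-16-be").hex(),      # big-endian UTF-16 code units, no BOM
            unicodedata.category(ch),          # e.g. 'Lo' letter, 'Mn'/'Mc' combining mark
            unicodedata.name(ch, "<unnamed>"),
        ))
    return rows

# Illustrative sample: a Devanagari letter followed by a vowel sign.
for row in dump_code_points("\u0939\u093f"):
    print(row)
```

The category column is the useful part for a word-boundary question, since it shows whether a code point is a letter or a combining mark attached to the preceding letter.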
Jeffrey C. Jacobs added the comment:
Matthew, I think that is considered a single word in Sanskrit or Thai so Python
3.x is correct. In this case you've written the Sanskrit word for Hindi.
--
___
Python tracker
<http://bugs.py
Changes by Jeffrey C. Jacobs :
--
nosy: +timehorse
___
Python tracker
<http://bugs.python.org/issue17980>
___
___
Python-bugs-list mailing list
Unsubscribe:
Jeffrey C. Jacobs added the comment:
Although V1, V2 is less wordy, technically the current behavior is version
2.2.2, so logically this should be re.VERSION222 vs. re.VERSION3 vs.
re.VERSIONn, with corresponding "(?V222)", "(?V3)" and future "(?Vn)". But
t
Jeffrey C. Jacobs added the comment:
On 1 September 2011 16:12, Matthew Barnett wrote:
>
> Matthew Barnett added the comment:
>
> I think I need a show of hands.
For my part, I recommend literal flags, i.e. re.VERSION222,
re.VERSION300, etc. Then you know exactly what you
Jeffrey C. Jacobs added the comment:
+1 on VC
--
___
Python tracker
<http://bugs.python.org/issue2636>
___
Jeffrey C. Jacobs added the comment:
What about a regex flag? Like regex.W or (?w)?
--
___
Python tracker
<http://bugs.python.org/issue2636>
___
Jeffrey C. Jacobs added the comment:
My only additional opinion is that re is very much used in deployed Python
applications and was written not just for correctness but also speed. As such,
regex should be benchmarked fairly to show that it is commensurately speedy. I
wouldn'
Jeffrey C. Jacobs added the comment:
Mea culpa et mes apologies,
The '-s' options in John's expressions are indeed executed only once --
they are one-time setup lines. The final quoted expression is what's
run multiple times.
In other words, improving caching in regex w
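The same setup-once, statement-many-times split can be seen from the timeit module's Python API. The sketch below is illustrative only: the `counter` dictionary and the pattern are assumptions made for the demonstration, not John's actual expressions.

```python
import timeit

# Hypothetical counters, used only to show how often each part runs.
counter = {"setup": 0, "stmt": 0}

elapsed = timeit.timeit(
    # The statement is what gets timed; it runs `number` times.
    stmt="counter['stmt'] += 1; pattern.search('xxabc')",
    # The setup (the '-s' part on the command line) runs once per timing run.
    setup="import re; counter['setup'] += 1; pattern = re.compile('abc')",
    number=1000,
    globals={"counter": counter},
)

print(counter)
print(f"{elapsed:.6f} s for 1000 runs")
```

Because the compile call sits in the setup, its cost is paid once and is excluded from the measured loop.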
Jeffrey C. Jacobs added the comment:
Re: timings
Thanks for the info, John. First of all, I really like those tests and
could you please submit a patch or other document so that we could
combine them into the python test suite.
The python test suite, which can be run as part of 'make
Jeffrey C. Jacobs added the comment:
Thanks, Antoine! Then I think for the most part any changes to Regexp
will have to wait for 3.2 / 2.7.
--
message_count: 71.0 -> 72.0
___
Python tracker
<http://bugs.python.org/iss
Jeffrey C. Jacobs added the comment:
Okay, as I said, Atomic Grouping, etc., off a recent 2.6 is already
available and I can do any cleanups requested to those already
mentioned, I just don't want to start any new items at the moment. As
it is, we are still over a year from any of
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
PCRE has some interesting suggestions on how the grammar for
recursive regular expressions might work. I am concerned about the use of
(?P>name) to call a regexp subexpression as an atomic subroutine. The
(?P>name
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Binary format searches should be supported once issue 1282 is implemented,
likely as part of issue 2636 Item 32. That said, I'm not clear what you
mean by exact search; wouldn't you want match instead? If your main is
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
This is another version of the redundant repeat issue defined in issues
2537 and 1633953 and although not described by the original report for
issue 214033, the comments further down that issue also describe a
similar situatio
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
The duplicate zero-or-one repeat operator bug described in this issue
originally no longer exists in python 2.6.
However, Trent Mick brings up a fair point in that expressions of the
form (x*)? generate an error (issue 1456280
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
On first blush, this issue sounds quite similar to issue 2537, but I
have been looking at different scenarios and found that there is a
subtle difference because, grammatically:
(?m)(?:.*$)(.*$)
is the same as:
(?m)(.*$){2}
Y
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Matthew, I've traced down the patch failures in my merges and now each
of the 4 versions of code on Launchpad should compile, though the first
2 do not pass all the negative look-behind tests, though your later 2
do. Any chanc
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Good work, Matthew. Now, another bazaar hint, IMHO, is one of my
favourite commands: switch. I generally develop all in one directory,
rather than getting a new directory for each branch. One does have to
be VERY careful to typ
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1456280>
___
_
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue214033>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1282>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7 -Python 2.5
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7 -Python 2.6
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7 -Python 2.4
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Tested on 2.6rc2 and slow but successful. Issue 1662851 may be related.
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7 -Python 2.4
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2650>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
versions: +Python 2.7, Python 3.1 -Python 2.6, Python 3.0
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Implementing Issue 3482 should solve this problem, and I will try to add
it to issue 2636 so that it is captured in the general Regexp 2.7
redesign.
--
nosy: +timehorse
versions: +Pyth
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
versions: +Python 2.7
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
versions: +Python 2.7 -Python 2.5
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.o
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1519638>
___
_
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
versions: +Python 2.7, Python 3.1 -Python 3.0
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
versions: +Python 2.7 -Python 2.6
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
versions: +Python 2.7, Python 3.1 -Python 2.6, Python 3.0
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3482>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3665>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3299>
___
__
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
To clarify, you mean named character sets as found in Perl and Emacs,
which are normally written, for example, like '[:ALPHANUM:]', right? We
are working on that as Item 8 of Issue 2636: Regexp 2.7. If not, please
clarify
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Phew! Okay, all your patches have been applied as I said in a previous
message, and you should now be able to check out
lp:~pythonregexp2.7/python/issue2636+01+09-02+17+18+19+20+21+24+26 where
you can then apply your latest known
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Thanks, Matthew. My reading of that Answer is that you should be okay
because you, I assume, installed the Windows-Native package rather than
the cygwin that I first tested. I think the problem is specific to
Cygwin as well
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Great, Matthew!!
Now, I'm still in the process of setting up branches related to your
work; generally they should be created from a core and set of features
implemented for example:
To get from Version 2 to Version 3 of you
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Matthew,
Did you upload a public SSH key to your Launchpad account?
You're on MS Windows, right? I can try and do an install on an MS
Windows XP box or 2 I have lying around and see how that works, but we
should try and
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Matthew, I'll try to merge all your diffs with the current repository
over the weekend. Having done the first, I know where code differs
between your implementation, mine and the base, so I can apply your
patch, and then a
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Thanks Matthew. You are now part of the pythonregexp2.7 team. I want
to handle integrating Branch 01+09-02+17 myself for now and the other
branches will need to be renamed because I need to add Item 26: Capture
Groups in Look-
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Yes, I see in your rc2+2 diff it was added into that. I will have to
allocate a new number for that fix though, as technically it's a
different feature than variable-length look-behind.
For now I'm having a hard time mergi
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Hmmm. Well, some of those are already covered:
#2636: self
#1160: Item 25
#1647489 : Item 24
#3511: Item 23
#3825: Item 9-2
#433028 : Item 21
#433027 : Item 20
#433024 : Item 19
#3262: Item 22
#3299: TBD
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Perl gives this result for your new expression:
"",undef,undef
undef,undef,"abc"
undef,"",undef
I think it has to do with not thinking of a string as a sequence of
characters, but as a sequence o
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Good catch, Matthew, and if you spot any other outstanding Regular
Expression issues feel free to mention them here.
I'll give issue 1160 an item number of 25 and think all we need to do
here is change SRE_CODE to be type
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
It seems that changing the size type of the Regular Expression Byte-code
is a nice quick-fix, even though it doubles the size of a pattern. It
may have the added benefit that most machine architectures available
today are at
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1160>
___
__
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I've enumerated the current list of Item Numbers at the official
Launchpad page for this issue:
https://launchpad.net/~pythonregexp2.7
There you will find links to each development branch associated with
each item, wher
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I've moved all the development branches to the ~pythonregexp2.7 team so
that we can work collaboratively. You just need to install Bazaar, join
www.launchpad.net, upload your public SSH key and then request to be ad
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Good catch on issue 1647489 Matthew; it looks like this is where that
bug fix will end up going. But, I am unsure if the solution for this
issue is going to be the same as for 3262. I think the solution here is
to add an interna
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Ah, I see the problem, if ptr is not incremented, then it will keep
matching the first expression, (^z*), so it would have to both 'skip'
the 'a' and NOT skip the 'a'. Hmm. You're right, Matth
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Never mind inclusion in 2.6 as no-one has repeated this bug in real-world
examples yet so it's going to have to wait for the Regexp 2.7 engine in
issue 2636.
--
versions: +Python 2.7
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Hmmm. This strikes me as a bug, beyond the realm of Issue 3262. The
two items may be related, but the dropping of the 'a' seems like
unexpected behaviour that I doubt any current code is expecting to
occur. Clearly, wha
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1647489>
___
_
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Thanks for weighing in Matthew!
Yeah, I do get some flak for item 2 because originally item 3 wasn't
supposed to cover named groups but on investigation it made sense that
it should. I still prefer 2 over-all but the nice
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I think this is even more complicated when you consider that
localization may be an issue. Consider "Á": is this grammatically before
"A" or after "a"? From a character set point of view, it is ty
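To make the ordering question concrete, here is a small sketch contrasting plain code-point order with one possible "base letter" order. The key function `base_letter_key` is hypothetical and is not real locale collation (for which `locale.strxfrm` would be the usual tool); it only shows that the two orderings disagree about where "Á" belongs.

```python
import unicodedata

letters = ["b", "\u00c1", "a", "B"]      # "\u00c1" is "Á"

# Plain sorting compares code points: 'B' (66) < 'a' (97) < 'b' (98) < 'Á' (193).
by_code_point = sorted(letters)

def base_letter_key(ch):
    """Hypothetical key: strip accents via NFD, then ignore case."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Fall back to the original character to keep the ordering deterministic.
    return (base.casefold(), ch)

by_base_letter = sorted(letters, key=base_letter_key)

print(by_code_point)
print(by_base_letter)
```

Under code-point order "Á" lands after every ASCII letter, while under the base-letter key it sits next to "a".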
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3511>
___
__
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Matthew,
I am really happy that you are making such progress on your engine, but
can I PLEASE ask you to slow down for a moment? We have a lot of issues
already listed in issue 2636 that is a catch-all for any Python 2.7
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue433028>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue433027>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue433024>
___
__
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I think Mike Coleman's proposal of enabling this behaviour via flag is
probably best and IMHO we should consider it under these circumstances.
Intuitively, I think your interpretation of what re.split should do
under zero-width
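For reference, splitting at zero-width matches can be sketched on top of re.finditer, independently of how re.split itself behaves. The helper name `split_on_matches` is an assumption for illustration and is not the flag-based API proposed in the thread. (Since Python 3.7, re.split itself also splits on patterns that can match an empty string.)

```python
import re

def split_on_matches(pattern, text):
    """Split text at every match of pattern, including zero-width matches."""
    pieces = []
    last = 0
    for match in re.finditer(pattern, text):
        pieces.append(text[last:match.start()])
        last = match.end()
    pieces.append(text[last:])
    return pieces

# A lookahead matches the empty string just before each capital letter.
print(split_on_matches(r"(?=[A-Z])", "camelCaseWord"))
```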
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue516762>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3654>
___
__
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
--
nosy: +timehorse
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3262>
___
__
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Update 16 Sep 2008:
Based on the work for issue #3825, I would like to simply update the
item list as follows:
1) Atomic Grouping / Possessive Qualifiers (See also Issue #433030)
[Complete]
2) Match group names as attribute
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I have uploaded my test cases for Atomic Grouping / Possessive
Qualifier, which is the common code we seem to have developed, as this
may be of use to you. I also have documentation, but for now, would you
mind running these tests a
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Well, I implemented this months ago, but have been busy with other
things so I haven't updated in a while. I noticed that the current
version is missing my patches for Atomic Grouping / Possessive
Qualifiers and a number of ot
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Thanks for weighing in Mark! Actually, your point is valid and quite
fair, though I would not assume that Item 3 would be included just
because Item 2 isn't. I will do my best to develop both, but I do not
make the final
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Sorry, as I stated in the last post, I generated the patches then realized
that I was missing the documentation for Item 2, so I have updated the
issue2636-02.patch file and am attaching that separately until the next
release
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file9897/PyLibDiffs.txt
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I have finished work on the Atomic Grouping / Possessive Qualifiers
support and am including a patch to achieve this; however,
http://bugs.python.org/issue2636 should be consulted for the complete list
of changes in the works f
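As a point of comparison, the standard re module gained atomic groups and possessive quantifiers in Python 3.11. The sketch below shows the behaviour those features are meant to provide; it is not the patch attached to this issue, and the sample patterns are illustrative only.

```python
import re
import sys

text = "abc"

# Ordinary group: after 'bc' is taken and the final 'c' fails, the engine
# backtracks into the group, tries 'b' instead, and the match succeeds.
plain = re.match(r"a(?:bc|b)c", text)

atomic = possessive = None
if sys.version_info >= (3, 11):
    # Atomic group: once 'bc' is taken the group is never re-entered,
    # so the trailing 'c' cannot match and the whole match fails.
    atomic = re.match(r"a(?>bc|b)c", text)
    # Possessive quantifier: 'b*+' keeps every 'b' it consumed,
    # leaving nothing for the final 'b'.
    possessive = re.match(r"ab*+b", "abbb")

print(plain is not None, atomic, possessive)
```

On interpreters older than 3.11 the guarded branch is skipped, so only the ordinary group is exercised.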
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10470/issue2636-07-only.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10469/issue2636-07.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10468/issue2636-05.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10428/issue2636-05-only.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10467/issue2636.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10052/issue2636-09.patch
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Well, it's time for another update on my progress...
Some good news first: Atomic Grouping is now completed, tested and
documented, and as stated above, is classified as issue2636-01 and
related patches. Secondly, with cav
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10053/issue2636-07.patch
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Added file: http://bugs.python.org/file10470/issue2636-07-only.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Added file: http://bugs.python.org/file10469/issue2636-07.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10429/issue2636-05.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Added file: http://bugs.python.org/file10468/issue2636-05.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10427/issue2636.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Added file: http://bugs.python.org/file10467/issue2636.diff
___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
Mark scribbled:
> One possible solution would be a grouptuples() function that returned
> a tuple of 3-tuples (index, name, captured_text) with the name being
> None for unnamed groups.
Hmm. Well, that's not a bad
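Mark's suggestion can be sketched on top of the existing match API. This is a minimal illustration written as a free function, assuming the (index, name, captured_text) shape he describes; it is not an implementation from the thread.

```python
import re

def grouptuples(match):
    """Return a tuple of (index, name, captured_text) for every group.

    The name is None for unnamed groups, as in the quoted proposal.
    """
    # groupindex maps name -> index; invert it to look names up by index.
    index_to_name = {index: name for name, index in match.re.groupindex.items()}
    return tuple(
        (index, index_to_name.get(index), match.group(index))
        for index in range(1, match.re.groups + 1)
    )

m = re.match(r"(?P<year>\d{4})-(\d{2})", "2008-09")
print(grouptuples(m))
```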
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Removed file: http://bugs.python.org/file10056/issue2636-05.patch
__
Tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Added file: http://bugs.python.org/file10429/issue2636-05.diff
__
Tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>:
Added file: http://bugs.python.org/file10428/issue2636-05-only.diff
__
Tracker <[EMAIL PROTECTED]>
<http://bugs.pytho
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:
I am finally making progress again, after a month of changing my
patches from my local svn repository to bazaar hosted on launchpad.net,
as stated in my last update. I also have more or less finished the
probably easiest item, #5