On Mon, 3 Aug 2020, Gaurav Mittal11 wrote:
> I am compiling PCRE 8.44 on HP-UX B.11.31 U ia64 with below options.
>
>
> export CC=/opt/aCC/bin/aCC
> export CFLAGS="+DD64 -mt"
> export CPPFLAGS="+DD64 -mt"
> export LDFLAGS="-L/usr/lib/hpux64/"
>
>
> It is compiling successfully from my own
I have just put 10.35 tarballs in the usual place:
https://ftp.pcre.org/pub/pcre/pcre2-10.35.tar.gz
https://ftp.pcre.org/pub/pcre/pcre2-10.35.tar.bz2
https://ftp.pcre.org/pub/pcre/pcre2-10.35.tar.zip
Since the release candidate, there has only been one change to the
library code (adding support
On Mon, 4 May 2020, ND via Pcre-dev wrote:
> /\A(?:\1b|(?=(a)))*\z/
> ab
> No match
>
>
> Both patterns must successfully match after second iteration.
Perl does exactly the same as PCRE. The problem is that analysing the
pattern to discover that matching nothing in one branch might make a
On Fri, 24 Apr 2020, Petr Pisar via Pcre-dev wrote:
> I think it's a mistake in PCRE2 code.
Oh, I was looking at something completely different. The PCHARS and
PCHARSV macros print character strings in different bit-widths by
calling appropriate width-specific functions. In many cases they are
On Fri, 24 Apr 2020, Petr Pisar via Pcre-dev wrote:
> > I have committed some revised code. Does this solve the issue?
> >
> It does not. The compiler is too smart (or dumb). The warning has changed
> into:
Oh, how annoying. There's another way of solving this, but it's more
complicated, which
On Thu, 16 Apr 2020, enh via Pcre-dev wrote:
> done. i've attached a patch that includes both the configure and cmake
> bits. i tested both with CC=gcc and CC=clang (for representative
> failure and success cases respectively).
Patch now applied and committed. However, I did have to make a
On Thu, 16 Apr 2020, Petr Pisar via Pcre-dev wrote:
> I noticed a new warning with GCC 10:
>
> gcc -DHAVE_CONFIG_H -I. -I./src "-I./src" -pthread -O2 -g -pipe -Wall
> -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
> -fexceptions -fstack-protector-strong
On Tue, 21 Apr 2020, enh via Pcre-dev wrote:
> it seems to be a fairly random mix of monospaced and proportional text
> for me. for example, the first line "Index:" is proportional, but then
> the --- and +++ lines are monospaced, and it goes back and forth a
> lot. (in both Chrome and Firefox.)
On Mon, 20 Apr 2020, enh via Pcre-dev wrote:
> > Thank you. I will deal with this in a day or two (diverted elsewhere at
> > the moment) along with several other minor tweaks that have just
> > arrived.
>
> thanks! (i was worried that the patch got mangled by the mailing list
> because it looks
On Wed, 15 Apr 2020, enh via Pcre-dev wrote:
> -PCRE2_SPTR stack_frames_vector[START_FRAMES_SIZE/sizeof(PCRE2_SPTR)];
> +PCRE2_SPTR stack_frames_vector[START_FRAMES_SIZE/sizeof(PCRE2_SPTR)]
> __attribute__((uninitialized));
> mb->stack_frames = (heapframe *)stack_frames_vector;
>
> i'm happy to
On Thu, 16 Apr 2020, Petr Pisar via Pcre-dev wrote:
> All tests pass with JIT enabled where available on GNU/Linux on these
> platforms:
Thank you.
> I noticed a new warning with GCC 10:
I'm still on gcc 9.3.0 (Arch Linux), which doesn't show that. I'll do
something about it. Maybe Arch will
I've just put the 10.35-RC1 testing release here:
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.35-RC1.tar.gz
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.35-RC1.tar.bz2
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.35-RC1.tar.zip
Bugs are fixed and there are a few new features: see NEWS and
On Thu, 5 Mar 2020, I wrote:
> > In dftables, add a -b option to save the table buffer after computation in
> > binary format instead of the C format ;
> > the file argument is unchanged.
>
> That is a useful idea; I will consider it.
OK, I have now done that and committed the patch. While I
On Wed, 4 Mar 2020, Patrice Guérin wrote:
> To be consistent with the pcre2_maketables_free() function in terms of
> alloc/free usage,
> provide a pcre2_maketables_reserve() function :
>
> PCRE2_EXP_DEFN uint8_t * PCRE2_CALL_CONVENTION
> pcre2_maketables_reserve(pcre2_general_context *gcontext,
On Thu, 27 Feb 2020, Dvir L via Pcre-dev wrote:
> I've tried to upgrade to 8.44, and got the same result.
> I didn't include the full pattern in the previous e-mail. It's something
> like - [A-Za-z]{1}[A-Za-z\d_]*\.
Needless to say, that works fine on my 64-bit Linux box. Sorry I can't
offer any
On Wed, 26 Feb 2020, Dvir L via Pcre-dev wrote:
> I'm using pcre 8.34 on a 32 bit Android device.
PCRE1 (the 8.xx series) is obsolete and will probably never have another
release (8.44 is recently out). All development happens in PCRE2 (the
10.xx series). As it is now 5 years since PCRE2 came
On Mon, 17 Feb 2020, I wrote:
> On Fri, 14 Feb 2020, Patrice Guérin wrote:
>
> > In my opinion, pcre2_maketables() is independent of 8/16/32 bits since it's
> > defined as uint8_t (ie bytes).
> > For the same reason, I think there is no endianness issue in the computation
> > of the table.
> >
On Fri, 21 Feb 2020, Kilian Kilger via Pcre-dev wrote:
> Matching with jit, it was very easy to produce an example which
> exceeds the available resources: We take the pattern
> "(*LIMIT_MATCH=10)(x+x+x+x+)+y" and as subject we take a string of
> length 10 containing only the letter "x".
>
>
On Fri, 14 Feb 2020, Patrice Guérin wrote:
> In my opinion, pcre2_maketables() is independent of 8/16/32 bits since it's
> defined as uint8_t (ie bytes).
> For the same reason, I think there is no endianness issue in the computation
> of the table.
> Saving and loading in binary should be ok.
I
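A minimal sketch of the point quoted above, written in Python rather than C: a table made of single bytes serializes to the same bytes whatever the byte order, whereas 16-bit values do not. The table builder is a hypothetical stand-in (ASCII lower-casing only) and is not the output of pcre2_maketables().

```python
import struct


def build_lowercase_table():
    # Hypothetical 256-entry byte table: maps 'A'-'Z' to 'a'-'z' and
    # leaves every other byte value unchanged.
    return bytes(i + 32 if 65 <= i <= 90 else i for i in range(256))


def pack_bytes(table, byte_order):
    # byte_order is "<" (little-endian) or ">" (big-endian).
    # "B" is an unsigned 8-bit value, so the order cannot matter.
    return struct.pack(f"{byte_order}{len(table)}B", *table)


def pack_u16(values, byte_order):
    # "H" is an unsigned 16-bit value; here the byte order does matter.
    return struct.pack(f"{byte_order}{len(values)}H", *values)
```

Packing the byte table with either order gives back the table itself, which is why a plain binary dump is portable in this case; the same is not true once the elements are wider than one byte.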
On Thu, 13 Feb 2020, Patrice Guérin wrote:
> I'm facing some problems with the locale character table definitions.
Locales are a nightmare. We will all be able to rejoice when Unicode is
everywhere. I'm afraid I know very little about locales, and as I'm a
Linux user, I know nothing about
On Fri, 14 Feb 2020, Kilian Kilger via Pcre-dev wrote:
> we try to use PCRE2 to match UCS-2 encoding, i.e. UTF-16 without any
> check for "broken" surrogates or any other invalid unicode. In UCS-2
> encoding every character is 2 bytes and every 2-byte sequence is
> accepted as a valid character.
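To make the UCS-2 description above concrete, here is a small sketch in Python (not PCRE2 code) that splits a byte string into 16-bit code units and accepts every unit as a character, with no surrogate validation. The function name and the little-endian default are assumptions made for illustration.

```python
import struct


def ucs2_code_units(data, little_endian=True):
    # Every 2-byte sequence is taken as one character; nothing is rejected,
    # not even an unpaired surrogate such as 0xD800.
    if len(data) % 2 != 0:
        raise ValueError("UCS-2 data must have an even number of bytes")
    order = "<" if little_endian else ">"
    return list(struct.unpack(f"{order}{len(data) // 2}H", data))
```

A strict UTF-16 decoder would refuse the lone surrogate in `b"\x00\xd8a\x00"`; this sketch simply returns it as a code unit, which is the behaviour the message asks for.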
I have just put the PCRE1 8.44 release here:
https://ftp.pcre.org/pub/pcre/pcre-8.44.tar.gz
https://ftp.pcre.org/pub/pcre/pcre-8.44.tar.bz2
https://ftp.pcre.org/pub/pcre/pcre-8.44.tar.zip
It is a year since the last PCRE1 release. There are only 7 logged
changes, and only two of them fix a real
On Mon, 10 Feb 2020, Rob Harrison wrote:
> Please can you add the information about not supporting Null Terminated
> Strings to the man page for pcre2_jit_match to avoid others also hitting the
> same brick wall?
Done. Thank you for pointing out this omission; sorry that you had to
spend so
On Tue, 24 Dec 2019, Ralf Junker wrote:
> With PCRE2 SVN revision 1193, pcre2test changes the output of testinput8.
>
> For LINK_SIZE=2, the corresponding result files have been adjusted
> accordingly.
>
> For LINK_SIZE=3 and LINK_SIZE=4 these files must still be updated:
>
>
On Tue, 24 Dec 2019, Ralf Junker wrote:
> For LINK_SIZE=2, the corresponding result files have been adjusted
> accordingly.
>
> For LINK_SIZE=3 and LINK_SIZE=4 these files must still be updated:
Thanks for noticing that. I didn't test with those link sizes (not
realizing it mattered), but this
On Mon, 2 Dec 2019, Ze'ev Atlas wrote:
> When I started, Philip had assured me that pcre2_jit_compile.c is
> indeed in src but does nothing in my context. He had also assured me
> that most of the rest of jit related code would be in src/sljit. I do
> not understand why the change of heart.
On Mon, 2 Dec 2019, Zoltán Herczeg wrote:
> those files are intended to be there. They are pcre-jit specific,
Alongside pcre2_jit_compile.c etc...
Philip
--
Philip Hazel
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
On Tue, 19 Nov 2019, Zoltán Herczeg wrote:
> Anyway I suspect Philip wants to release PCRE2 as soon as possible, so
> if you don't mind we could track this down after the release.
I see you have fixed this. Thanks to both of you for getting that done.
Shall I go ahead with 10.34 now? Actually,
On Mon, 11 Nov 2019, Petr Pisar via Pcre-dev wrote:
> Frankly I don't believe there is a way of solving it and I'd just keep the
> warning there. Using C99 conformant compilers is the correct way. E.g. passing
> -std=c99 to GCC with glibc fixes the warning. I'd just document it somewhere
> in
I've just discovered that the University of Cambridge web hosting
service, where I've had a small web site for distributing some of my
non-PCRE software, has been closed down.
So ... what advice can anybody give me about finding somewhere to
distribute a few software packages via a web site? I
On Tue, 12 Nov 2019, Zoltán Herczeg wrote:
> Patch landed. Thank you for fixing all issues.
Yes, many thanks to everybody. Do we need another RC, or should I just
go ahead with a full release in a few days' time?
Philip
On Thu, 7 Nov 2019, Petr Pisar via Pcre-dev wrote:
> I can see GCC 4.8.5 prints these warnings on 32-bit PowerPC:
>
> gcc -DHAVE_CONFIG_H -I. -I./src "-I./src" -pthread -O2 -g -pipe -Wall
> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
> --param=ssp-buffer-size=4
On Wed, 6 Nov 2019, Sebastian Pop via Pcre-dev wrote:
> Maybe we could add this to the ChangeLog?
Done, and committed.
Philip
On Wed, 6 Nov 2019, Zoltán Herczeg wrote:
> Philip, I think you can create another RC.
There is now a new RC here:
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.34-RC2.tar.gz
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.34-RC2.tar.bz2
I have just made available a Release Candidate for PCRE2 10.34 here:
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.34-RC1.tar.gz
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.34-RC1.tar.bz2
https://ftp.pcre.org/pub/pcre/Testing/pcre2-10.34-RC1.tar.zip
NOTE: this is a different FTP site than
On Tue, 15 Oct 2019, Ralf Junker wrote:
> As of SVN revision 1176, pcre2_get_startchar() may return an arbitrary,
> undefined result.
...
> Without further testing, the same problem seems to be present for JIT
> matching at around line 6215.
Thank you for picking this up. I have done the
On Thu, 3 Oct 2019, Sathish Kumar Subramani via Pcre-dev wrote:
> I am trying to build PCRE 8.43 version in my windows platform using Visual
> studio 2013. I am facing error with 'snprintf': identifier not found in
> pcregrep.c file. Could you please provide if any patch available to build
>
On Mon, 26 Aug 2019, I wrote:
> On Sun, 18 Aug 2019, ND via Pcre-dev wrote:
>
> > May be when meet (*ACCEPT) find_minlength must simply drop further
> > calculations for current branch. So the current value of "branchlength" will
> > be immediately considered as a minimum length of whole branch.
On Sun, 18 Aug 2019, ND via Pcre-dev wrote:
> May be when meet (*ACCEPT) find_minlength must simply drop further
> calculations for current branch. So the current value of "branchlength" will
> be immediately considered as a minimum length of whole branch.
That was not easily possible in the
On Sat, 10 Aug 2019, ND via Pcre-dev wrote:
> I would appreciate if at first we reach a consensus on these suggestions
> before make any rewrite of partial matching that you gonna do.
I do not intend to make any more changes to the partial matching code. I
may do further updates to the
The ftp.csx.cam.ac.uk site, from which PCRE has been distributed, has
been discontinued. For various reasons this has happened rather
suddenly, which is why no notice was given here.
However, the site that holds all PCRE releases from 8.00 onwards remains
active, and that is where new releases
On Sat, 27 Jul 2019, ND via Pcre-dev wrote:
> /b(? abc
> 0: ab
><
>
> Why "a" showed as text that was consulted during a successful pattern match,
> but "c" not?
There was a bug. I have fixed it. Thanks for noticing.
Philip
On Sat, 27 Jul 2019, ND via Pcre-dev wrote:
> There are some kinds of problems that exist with max_lookbehind:
It was always a hack to try to make it possible to do multi-segment
matching using the normal matching function, something for which it was
not designed.
> - bugs
> - performance
On Sat, 3 Aug 2019, ND via Pcre-dev wrote:
> May be it can be useful to have ability to set a limits of lookbehind search
> for performance reasons.
> I can imagine a rule: If nonfixedlength lookbehind immediately preceded by
> capture group, then it is restricted to start position of this group.
On Mon, 29 Jul 2019, 虚空幻影 via Pcre-dev wrote:
> As follows is my test.
>
> ./pcre2test
>
> PCRE2 version 10.33
>
> 2019-04-16
>
> re> /b(?
> data> abc
>
> No match
>
> data>
>
>
>
> I tried to test your case, but the result is different from yours, why?
You are using 10.33.
On Wed, 31 Jul 2019, Zoltán Herczeg wrote:
> You have already convinced me to drop MOVE :)
> The question is whether we keep the other construct. Or "rematching" a
> capturing block in an assertion like fashion would solve this problem better.
I don't think that would solve the original
On Wed, 31 Jul 2019, Zoltán Herczeg wrote:
> If we consider the following pattern:
> /(*napla:a|a)+/
>
> is the same as:
> /(?:(*napla:a|a))+/
>
> Then we have an empty match if I understand correctly the behavior of
> this new construct.
Oh, sorry, I was thinking of
On Wed, 31 Jul 2019, Zoltán Herczeg wrote:
> You are right. Since you can put it into a group, it is not possible
> to prevent repetitions. However the rule that empty matches break
> (non-fixed) loops may solve this problem.
... but it's not an empty match.
> I start to understand why perl
On Tue, 30 Jul 2019, Zoltán Herczeg wrote:
> > (*MOVE) is a small addition and solves ND's non-atomic assertion
> > requirement. Perhaps we can just start with (*MOVE).
>
> Yes, if we choose this option to implement.
It occurs to me that (*MOVE) gives scope for infinite loops:
On Tue, 30 Jul 2019, Zoltán Herczeg wrote:
> Thinking about practical use cases. With the proposed changes, doing a
> submatch is quite overcomplicated:
>
> (*:A)submatch(*:B)(*MOVE:A)(*SETEND:B)match-submatch-again(*MOVE:B)(*SETEND)
>
> Perhaps the other idea, use capturing brackets for this
On Mon, 29 Jul 2019, Zoltán Herczeg wrote:
> > May be it not quite effective and still have restrictions but is useful.
> > Is it simple to add such functionality?
>
> Definitely not easy in JIT.
Not easy in the interpreter either.
> I have an alternative solution which might be able to solve
On Sat, 27 Jul 2019, ND via Pcre-dev wrote:
> It seems last code unit "c" is not detected and so start optimization don't
> work:
>
>
> PCRE2 version 10.34-RC1 2019-04-22
> /\Aabc/info,auto_callout
> Capture group count = 0
> Max lookbehind = 1
> Compile options: auto_callout
> Overall options:
On Wed, 24 Jul 2019, ND via Pcre-dev wrote:
> In terms of multisegment matching this may be say: partial hard match occurs
> when current segment is not last and it's content not enough to exactly
> determine, what match (or nomatch) would have WHOLE subject from this start
> position.
Yes, more
On Sun, 21 Jul 2019, ND via Pcre-dev wrote:
> /(?![ab]).*/
> ab\=ph
> 0:
>
> /c*+/
> ab\=ph,offset=2
> 0:
The characteristic of these is that the pattern can match an empty
string. I have now added this condition (which was easily done with no
repeated test) and those patterns now give
On Sun, 21 Jul 2019, ND via Pcre-dev wrote:
> New algorithm still have another parts of discussed oversight. For example it
> returns full match instead of partial in following cases:
>
> /(?![ab]).*/
> ab\=ph
> 0:
>
> /c*+/
> ab\=ph,offset=2
> 0:
The answer to that may lie in thinking about
I have just committed a patch that makes some small changes to the way
partial matches are handled in the interpreter. I hope Zoltán will in
due course pick these up for the JIT. (There are new tests at the end of
testinput2 which have no_jit set at the moment.)
The changes are really quite
On Wed, 17 Jul 2019, ND via Pcre-dev wrote:
Let us ignore for the moment whether there should be a new option or
not, and try to figure out what new logic might be needed. I am going to
experiment with the suggestion I made earlier:
If a hard partial match is possible, return PCRE2_PARTIAL if
On Mon, 15 Jul 2019, ND via Pcre-dev wrote:
> This option is added ten years ago EXACTLY for multisegment matching.
> Please read a very first proposal post and thread about it. Thats how
> partial_hard is born:
> https://lists.exim.org/lurker/message/20090524.142622.cb850f3a.en.html
Your memory
On Sat, 13 Jul 2019, I wrote:
> > May be "[^a]" can use the same algorithm as "[^ab]"?
>
> [^a] is optimized into a different (faster) opcode; I will see if this
> can easily produce the same starting code units as [^ab] for tidyness. I
> do not expect it will do much for performance.
Having
On Tue, 16 Jul 2019, ND via Pcre-dev wrote:
> /(*napla:^x|^y)/I
> Capture group count = 0
> May match empty string
> Compile options:
> Overall options: anchored
> Starting code units: x y
> Subject length lower bound = 0
>
> We have starting code unit. Isn't Subject length lower bound must be
On Mon, 15 Jul 2019, I wrote:
> However, there does exist the PCRE2_NOTEOL option. At the moment, this
> is applied only to the $ meta character, not \z or \Z. Perhaps it
> should.
Or perhaps an entirely new option PCRE2_NOTEOS (not end of subject)
should be invented, to stop \z ever
On Sun, 14 Jul 2019, I wrote:
> I am still not entirely convinced this change should be made.
And thinking about it overnight has not changed my mind. Requesting a
partial match was never intended to have the implication "this is not
the end segment".
However, there does exist the
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> At its core \z is positive lookahead assertion that want to inspect next
> character of subject.
I must admit I had not thought of it like that. I considered it just to
be "are we at the end of the subject?".
> I propose following algorithm (for
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> Is there people that successfully compile PCRE2 under MSVC 2019 to tell with
> them?
> Is there detailed doc how compile PCRE2 with MSVC 2019?
I am sorry that I cannot help, but I don't even use Windows, let alone
MSVC. All the information I put in
On Sat, 13 Jul 2019, Zoltán Herczeg wrote:
> Somehow it doesn't feel right to call this new construct as an
> "assertion", which normally checks whether a condition is true. I
> think the nature of this new construct is closer to "script run" which
> adds an extra task after a bracket is matched.
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> Unfortunately PCRE2 svn version is not compiled for me with Microsoft Visual
> Studio 2019 on Windows 7x64.
Can you compile the released source versions? (There shouldn't be any
difference, but I just wondered.)
Philip
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> PCRE2_PARTIAL_HARD is intended for multisegment matching. I think when this
> option is set it means: this subject IS incomplete, it's only a non-last part
> of a certain "entire" subject.
It was never intended to mean "this subject is incomplete",
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> PCRE try to detect starting code units in attempt to apply a start
> optimization.
> As we can see from next two examples, it detects starting code units for
> "[^ab]", but don't doing this for "[^a]". I think it looks a bit curiously.
> May be "[^a]"
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> It seems this example introduce not partial matching but "regular" matching
> bug.
>
> PCRE2 version 10.33 2019-04-16
> /(?<=(?=(?<=a)))b/
> ab
> No match
>
>
> While Perl is correctly match "b".
Probably the same bug as in your previous message.
On Fri, 12 Jul 2019, ND via Pcre-dev wrote:
> > > > >PCRE2 version 10.33 2019-04-16
> > > >/(?<=(?=.(?<=x)))/
> > > >ab\=ph
> > > >Partial match: b
> > > > Why it matched b?
> >Again, it has inspected at least one character, and if you add "x" it matches.
>
> But I not try to add x. I inspect
On Tue, 9 Jul 2019, I wrote:
> I have put this on the wish list, but until I look at the code, I have
> no idea whether it will be easy or straightforward to implement in the
> interpreter. I will try to investigate soon. If it turns out to be
> possible in the interpreter, it will be up to
On Fri, 12 Jul 2019, ND via Pcre-dev wrote:
> This is about my second example.
> But it seems first example have another issue:
>
> >PCRE2 version 10.33 2019-04-16
> >/(?<=(?=.(?<=x)))/
> >ab\=ph
> >Partial match: b
>
> Why it matched b?
Again, it has inspected at least one character, and if
On Thu, 11 Jul 2019, ND via Pcre-dev wrote:
> I guess you told about second example (in first example "x" don't adds). I
> believed empty match at the end of string is not counted as partial.
This is a documentation issue. Instead of "empty match" read "match in
which no characters are
On Wed, 10 Jul 2019, ND via Pcre-dev wrote:
> PCRE2 version 10.33 2019-04-16
> /(?<=(?=.(?<=x)))/
> ab\=ph
> Partial match: b
>
>
> /(?<=.(?=x))/
> ab\=ph
> Partial match: b
> <
>
> Isn't both results should be "no match" instead of "partial match"?
Why? "Partial match" means
On Wed, 10 Jul 2019, Zoltán Herczeg wrote:
> > /(?0)/
> As far as I remember, these are detected by the parser.
Some of them are detected by the parser in PCRE1, but not all of them,
so there is a runtime check. Looks like I decided to leave it all to
runtime in PCRE2. The error message "nested
On Mon, 8 Jul 2019, ND via Pcre-dev wrote:
> And if we disregards Perl's bugs then it seems (*COMMIT) in Perl works in a
> following manner:
>
> 1. Backtracking can't move to the left of COMMIT (this is PCRE behaviour too)
> 2. If COMMIT occurs then no advance match to any other position of
On Sun, 7 Jul 2019, ND via Pcre-dev wrote:
> If it's simple to add a Non-atomic positive lookaheads then how are you about
> put it to PCRE wishlist please.
> It can be looks like
> (*non_atomic_positive_lookahead:...)
> (*napla:...)
I have put this on the wish list, but until I look at the
On Tue, 2 Jul 2019, ND via Pcre-dev wrote:
> It seems Perl is either buggy or has a really different conception of (*COMMIT)
> than PCRE.
I am waiting for further information from the Perl developers, but I
suspect that I won't want to change PCRE2, except perhaps to add more
detail to the
On Tue, 2 Jul 2019, I wrote:
> > PCRE2 version 10.33 2019-04-16
> > /\A(?:.(*COMMIT))*c/
> > abcd
> > No match
> >
> > But Perl reports that this is successful match "abc".
>
> I think this is also a Perl bug and I will report it.
A Perl developer has admitted there is some ambiguity, but
On Tue, 2 Jul 2019, Zoltán Herczeg wrote:
> Perhaps the misunderstanding comes from the fact that we are talking
> about the pattern and they talk about the matching process. So (*THEN)
> simply starts a backtrack, and when an alternation is encountered, it
> switches to the next alternative.
On Tue, 2 Jul 2019, Zoltán Herczeg wrote:
> If you are right about the internal working of (*THEN), then this verb
> has a very unclear and inconsistent behavior, which is very hard to
> track for a user.
And it totally contradicts the Perl documentation, in particular, this
sentence:
Note
On Mon, 1 Jul 2019, ND via Pcre-dev wrote:
> As you participate in Perl regex development can you take a look at another
> Perl bug please:
I do not participate in Perl regex development. I just report bugs when
I find them, using the perlbug command. You could do this yourself. (And
you seem
On Sun, 30 Jun 2019, ND via Pcre-dev wrote:
> PCRE2 version 10.33 2019-04-16
> /\A(?:.|..)(*THEN)c/
> abc
> No match
>
>
> Perl is match "abc".
> I suppose "next innermost alternative" is interpreted differently by PCRE and
> Perl.
>
> If so, may be PCRE should go Perl way in this matter?
I
On Sun, 23 Jun 2019, I wrote:
> I woke up in the middle of last night with an idea as to how it could
> easily be made better, but I haven't looked at the code yet. I am busy
> with other things today and tomorrow, but then I will see if my midnight
> bright idea actually works.
I have
On Sun, 23 Jun 2019, ND via Pcre-dev wrote:
> On 2019-06-23 04:33, ND wrote:
> >Or this calculations occurs at compile time while partial matching flag is
> >set at matchtime?
That is correct.
> Oh! Now I read docs about it.
> It seems that PARTIAL are compiletime option only for JIT. So it
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> It may be less unusual if we use simple assertions:
That is probably what most people do.
> I agree that max lookbehind value corresponds to docs.
> But this is not an end in itself. We keep in mind that max lookbehind value
> calculation intended
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> PCRE2 version 10.33 2019-04-16
> /(?<=(?<=a)b)c.*/info
> Capture group count = 0
> Max lookbehind = 1
> First code unit = 'c'
> Subject length lower bound = 1
> abc\=ph
> Partial match: bc
> <
>
> Why max lookbehind=1, but not 2?
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Sorry for my bad English.
> I need to find word that is closest to the end of text and occurs at least 10
> times in that text.
Yes, I understand that now. I will think about it.
Philip
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> >If (*SKIP) is used inside a lookbehind to specify a new starting
> >position...
>
> I suggest to remove "inside a lookbehind".
> A new starting position that is not later than the starting point of the
> current match may occur without lookbehind:
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Your example is not working right (let's change 10 to 3 for simplicity):
>
> /\A.*\b(\w++)(?>.*?\b\1\b){2}/
> word1 word1 word2 word2 word2 word1
> 0: word1 word1 word2 word2 word2
> 1: word2
>
> We want to capture "word1" as most closer to the end
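For reference, the requirement in this thread can be stated procedurally. The sketch below is plain Python and deliberately not a single-regex solution, which is what the thread is actually looking for; the function name and the use of `\w+` to split words are assumptions.

```python
import re
from collections import Counter


def last_frequent_word(text, min_count):
    # Split into words, count each one, then walk backwards from the end
    # and return the first word seen that occurs at least min_count times.
    words = re.findall(r"\w+", text)
    counts = Counter(words)
    for word in reversed(words):
        if counts[word] >= min_count:
            return word
    return None
```

With the subject from the quoted test and a threshold of 3, the expected answer under this definition is "word1", since it is the last word in the text and occurs three times.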
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> /\A(?:a|(?=b)|.){50}\z/
> abc
> 0: abc
>
> when engine in a strange way decides that it was exactly 50 repetitions.
That is not an unlimited repeat, so there is no special action for
matching an empty string. Therefore, (?=b) matches 47 times. A
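The fixed-count case can be tried outside PCRE2 as well. The sketch below uses Python's `re` module, which is a different engine and is only assumed here to treat a fixed-count repeat the same way; `\Z` is used for the end-of-subject assertion written `\z` in the thread, and the helper name is hypothetical.

```python
import re


def matches_with_exact_repeat(subject, count):
    # Same shape as the quoted pattern, with the repeat count as a parameter.
    # The group can match "a", the empty string (via the lookahead), or any
    # single character, so iterations may consume nothing.
    pattern = r"\A(?:a|(?=b)|.){%d}\Z" % count
    return re.search(pattern, subject) is not None
```

For the subject "abc" only three iterations can consume a character, so with a count of 50 the remaining 47 iterations have to be empty matches of the lookahead, which is the arithmetic behind the figure in the reply. A count below 3 cannot reach the end of the subject at all.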
On Fri, 21 Jun 2019, ND via Pcre-dev wrote:
> Imagine that we have a text. There are some words in this text that occurs at
> least 10 times. We want to find from they a word that is most closer to the
> end of text.
>
> If lookahead assertion is non-possessive then we can use this pattern:
>
>
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Successful match of "X*\z" means that PCRE says: X CAN be successfully
> repeated until the very end of subject (let's the match is "abc" for example).
> When we use "X*" we want to say: repeat X as much as it can.
Yes, but there is special
On Mon, 17 Jun 2019, ND via Pcre-dev wrote:
> Chapter ISSUES WITH MULTI-SEGMENT MATCHING of pcre2partial.html includes item
> 2 with description how to process with lookbehind assertions.
>
> I think it's important to add to this algorithm a some words about "no match":
> If result of partial
On Wed, 19 Jun 2019, ND via Pcre-dev wrote:
> >At present, lookarounds do not take part in minimum length calculations,
>
> I see lookarounds takes part: first and last code units are searched in
> lookarounds too.
I wasn't quite precise. Lookarounds are not scanned when computing
a minimum
On Mon, 17 Jun 2019, ND via Pcre-dev wrote:
> In pcre2test docs in chapter RESTARTING AFTER A PARTIAL MATCH there is
> example:
>
> data> 23ja\=P,dfa
>
> What matching option "P" is? May be it should be corrected to "ph" or "ps"?
Thank you. Yes, that should be "ps". This is a hangover from
On Mon, 17 Jun 2019, ND via Pcre-dev wrote:
> I don't find in docs behaviour of SKIP when corresponding position is before
> or equal start_offset.
> It seems that in this case a "bumpalong" advance is 1, not SKIP or associated
> MARK position.
Yes, that is true. The code contains this comment:
On Sun, 16 Jun 2019, ND via Pcre-dev wrote:
> PCRE2 version 10.33 2019-04-16
> /(?:a|(?=b)|.)*\z/
> abc
> 0: abc
>
> May be docs need some clarification about what happened at that point.
> After lookahead assertion (?=b) matches, loop is not broken. It seems a
> backtracking occurs as if group
On Sun, 16 Jun 2019, ND via Pcre-dev wrote:
> A following example was included in docs (pcre2pattern.html) :
>
> A(*ACCEPT)??BC
>
> But this example does not show what we can do with (*ACCEPT)?? that cannot be
> done well with other PCRE facilities.
> I suggest to show in docs another example
On Thu, 20 Jun 2019, Zoltán Herczeg wrote:
> > (?=x|y) looks much more ergonomical than (?:(?=x)|(?=y))
>
> They behave the same way, so pick whatever you prefer.
(?:(?=X)|(?=Y))Z means "if X matches, try to match Z; if that fails, if
Y matches try to match Z". In the simple case the second
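The two spellings discussed here can be compared directly. This is a sketch using Python's `re` module as a stand-in (not PCRE2), with "foo", "bar" and `\w+` as illustrative choices for X, Y and Z; none of them come from the thread.

```python
import re

# One lookahead containing an alternation.
COMBINED = re.compile(r"(?=foo|bar)\w+")
# An alternation of two separate lookaheads.
SPLIT = re.compile(r"(?:(?=foo)|(?=bar))\w+")


def first_match(pattern, subject):
    # Return the text of the first match, or None if there is no match.
    m = pattern.search(subject)
    return m.group(0) if m else None
```

Both forms find the same text in these simple subjects. The difference described in the reply is in how the match proceeds: in the split form, a failure of Z after X can lead to Z being tried again after Y.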