On 2019-08-01 08:20, Zoltán Herczeg wrote:
If we used your idea of doing a (0,n-1) match, that could be too
slow for large subjects, and people would complain.
Yes, it could be slow. But:
- we can use [\G,n-1]
- we can honestly warn about it in the docs. The performance of X(?<=Y)
would be roughly comparable to that of "X(?=.*?Y\z)", wouldn't it?
Before we choose anything to implement, it would be good to know about
the problems we want to solve, and especially whether we can solve them
with the current constructs. I mean, you can always construct artificial
use cases for certain features, but are they necessary, or can you solve
them in other ways?
I have needed a variable-length lookbehind twice:
1. when data came as a stream of 24 kB blocks and I needed to find the
last number in each block
/.{24000}(?<=(\d++)\D*+)/g
2. when I had a JSON array file and wanted to find every top-level
element that has an "id" key at any nesting level
/(\{(?:[^{}]++|(?1))*+\})(?<=\{"id":"(?>.*?").*)/g
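To make the intent of the two patterns concrete, here is a minimal
Python sketch of what they are meant to compute. It is only an
illustration under assumptions: the function names are hypothetical, the
block size is taken from the .{24000} above, a block without digits
yields None (the lookbehind could instead reach back into an earlier
block), and the JSON check looks for an "id" key anywhere inside an
object rather than reproducing the exact textual test of the pattern.

```python
import json
import re

BLOCK_SIZE = 24000  # taken from the .{24000} in the first pattern


def last_number_per_block(data, block_size=BLOCK_SIZE):
    """Return the last run of digits in each full block (None if absent)."""
    results = []
    # Only full blocks are examined; a shorter tail is skipped, just as
    # .{24000} would not match it.
    for start in range(0, len(data) - block_size + 1, block_size):
        block = data[start:start + block_size]
        numbers = re.findall(r"\d+", block)
        results.append(numbers[-1] if numbers else None)
    return results


def has_key(value, key="id"):
    """True if `key` occurs in `value` at any nesting level."""
    if isinstance(value, dict):
        return key in value or any(has_key(v, key) for v in value.values())
    if isinstance(value, list):
        return any(has_key(v, key) for v in value)
    return False


def top_level_objects_with_id(json_text):
    """Return the top-level objects of a JSON array that contain "id"."""
    elements = json.loads(json_text)
    return [el for el in elements if isinstance(el, dict) and has_key(el)]
```

This of course requires a host language, which is exactly what is not
always available.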
It is also common to combine regexes with a scripting language. For
example, you first split a string into records and then search within
each record, rather than trying to do everything with one complicated
regex.
1. Combining is sometimes bad from a performance point of view.
2. There are cases where no programming language is available to the
user, only regex. This is exactly the case in one of my applications.
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev