i think you might be able to use REG_STARTEND to avoid the implicit strlen: https://www.freebsd.org/cgi/man.cgi?query=regex&sektion=3&manpath=freebsd-release-ports
REG_STARTEND The string is considered to start at string + pmatch[0].rm_so and to end before the byte located at string + pmatch[0].rm_eo, regardless of the value of nmatch. See below for the definition of pmatch and nmatch. This is an extension, compatible with but not specified by IEEE Std 1003.2 (``POSIX.2''), and should be used with cau- tion in software intended to be portable to other systems. Without REG_NOTBOL, the position rm_so is considered the beginning of a line, such that `^' matches before it, and the beginning of a word if there is a word character at this position, such that `[[:<:]]' and `\<' match before it. With REG_NOTBOL, the character at position rm_so is treated as the continuation of a line, and if rm_so is greater than 0, the preceding character is taken into consideration. If the preceding character is a newline and the regular expression was compiled with REG_NEWLINE, `^' matches before the string; if the preceding character is not a word character but the string starts with a word character, `[[:<:]]' and `\<' match before the string. From: Rob Landley <r...@landley.net> Date: Sun, May 5, 2019 at 9:41 AM To: enh Cc: toybox > On 5/3/19 2:42 PM, enh wrote: > > On Fri, May 3, 2019 at 11:59 AM Rob Landley <r...@landley.net> wrote: > >> > >> On 5/3/19 1:56 PM, Rob Landley wrote: > >>> On 5/3/19 1:05 PM, enh wrote: > >>> But yeah, the new pessimal case after the change I'm making now would be a > >>> megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would need the one output > >>> buffer > >>> thing... > >> > >> And it can be avoided by an in-place copy that remembers "last gap" and > >> doesn't > >> copy trailing data until the _next_ hit (or we confirm there's no hit) so > >> we > >> know how much of the skipped data is "confirmed" and only copy it once we > >> know > >> we're keeping it. > > > > yeah, that's what i meant by only ensuring that the first n characters > > are right. > > > >> So every kept byte is only copied once, and it can still be done in place > >> for > >> the shrinking case. (You'll have to realloc when you expand, but that's > >> probably > >> unavoidable...) > > Which I just did (not even trying in-place, everything gets copied once to a > new > string), and it's still way too slow. > > Time to dig out https://jvns.ca/perf-zine.pdf and see where it's spending its > time... Sigh. I forgot quite how much I hate perf, but: > > 68.60% sed sed [.] regexec0 > 29.93% sed [kernel.kallsyms] [k] nmi > 1.47% sed libc-2.24.so [.] strlen > 0.00% sed [kernel.kallsyms] [k] __acct_update_integrals > > It's regexec0(). Hmmmm... ah, my NUL wrapper is essentially calling strlen() > on > the string before doing the regex each time (to figure out where the null to > skip to next is), that should go _after_ it tries to match this segment... > > Ok, and now that I've fixed that it's _still_ taking forever and strlen() is > dominating, but regexec0() is basically calling regexec() as the first thing > and > returning immediately when it hits? The only thing that could be calling > strlen() is libc's regexec()? Hmmm... > > @@ -473,11 +473,15 @@ static void sed_line(char **pline, long plen) > mlen, off, newlen; > > // Loop finding match in remaining line (up to remaining len) > - while (!regexec0(reg, rline, len-(rline-line), 10, match, mflags)) { > +// while (!regexec0(reg, rline, len-(rline-line), 10, match, mflags)) { > +while (!strncmp(rline, "x", 1)) { > + match[0].rm_eo = 1; > + match[1].rm_so = 0; > + > > And _that_ completes a megabyte input line basically instantly. It's regexec() > calling strlen() each time pulling (n^2)/2 bytes (half a terabyte?) through > the > l1 cache that's taking up all the time now. > > Grrr. So I either have to write my own regex plumbing (which I can do, I've > done > it before years ago, but I really didn't _want_ to here because LIBC HAS ONE), > or I need to make my wrapper special case literal matches, which means they > don't have... let's see... (For the honor of <strike>greyskull</strike> "man 7 > regex"...) > > ^|.[*+?(\{$ > > Sigh. If I _was_ going to write my own regex plumbing I could make it take a > length and properly match NUL bytes. On the one hand that's really a post-1.0 > todo item. On the other hand we're talking maybe 200 lines of code here, _and_ > it could probably speed up grep (I could go back to searching for multiple > regexes in parallel... What _is_ involved in implementing lex, anyway...) > > Rob _______________________________________________ Toybox mailing list Toybox@lists.landley.net http://lists.landley.net/listinfo.cgi/toybox-landley.net