Re: [Toybox] [PATCH] env: fix case where a variable is replaced.

2019-05-03 Thread enh via Toybox
Thanks, it's there now. I'll see how that goes...

*From:*Rob Landley 
*Date:*Fri, May 3, 2019, 13:29
*To:*enh, toybox

Forgot to push, sorry.
>
> Rob
>
> On 5/3/19 2:47 PM, enh via Toybox wrote:
> > ping?
> >
> > On Thu, May 2, 2019 at 6:47 PM enh  wrote:
> >>
> >> Found when trying to update the toybox prebuilt used for the Android
> >> build.
> >>
> >> Also add the corresponding test.
> >> ---
> >>  lib/env.c  | 4 +---
> >>  tests/env.test | 2 ++
> >>  2 files changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/lib/env.c b/lib/env.c
> >> index 35ef688c..bc23b753 100644
> >> --- a/lib/env.c
> >> +++ b/lib/env.c
> >> @@ -58,11 +58,9 @@ void xsetenv(char *name, char *val)
> >>  if (!memcmp(name, environ[i], len) && environ[i][len]=='=') {
> >>if (i>=envc) free(environ[i]);
> >>else {
> >> -char **delete = environ+i;
> >> -
> >>  // move old entries down, add at end of old data
> >>  toys.envc = envc--;
> >> -for (i=0; new ? i delete[i+1];
> >> +for (; new ? i environ[i+1];
> >>  i = envc;
> >>}
> >>break;
> >> diff --git a/tests/env.test b/tests/env.test
> >> index 3098731d..4df118dd 100755
> >> --- a/tests/env.test
> >> +++ b/tests/env.test
> >> @@ -19,3 +19,5 @@ unset WALRUS BANANA LETTERS FILTER
> >>
> >>  testcmd "early fail" '--oops 2> /dev/null ; echo $?' "125\n" "" ""
> >>  testcmd "why is this allowed" "=BLAH env | grep '^=BLAH\$'" "=BLAH\n"
> "" ""
> >> +
> >> +testcmd "replace" "A=foo PATH= `which printenv` A" "foo\n" "" ""
> >> --
> >> 2.21.0.1020.gf2820cf01a-goog
> > ___
> > Toybox mailing list
> > Toybox@lists.landley.net
> > http://lists.landley.net/listinfo.cgi/toybox-landley.net
> > .
> >
>
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] env: fix case where a variable is replaced.

2019-05-03 Thread Rob Landley
Forgot to push, sorry.

Rob

On 5/3/19 2:47 PM, enh via Toybox wrote:
> ping?
> 
> On Thu, May 2, 2019 at 6:47 PM enh  wrote:
>>
>> Found when trying to update the toybox prebuilt used for the Android
>> build.
>>
>> Also add the corresponding test.
>> ---
>>  lib/env.c  | 4 +---
>>  tests/env.test | 2 ++
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/env.c b/lib/env.c
>> index 35ef688c..bc23b753 100644
>> --- a/lib/env.c
>> +++ b/lib/env.c
>> @@ -58,11 +58,9 @@ void xsetenv(char *name, char *val)
>>  if (!memcmp(name, environ[i], len) && environ[i][len]=='=') {
>>if (i>=envc) free(environ[i]);
>>else {
>> -char **delete = environ+i;
>> -
>>  // move old entries down, add at end of old data
>>  toys.envc = envc--;
>> -for (i=0; new ? i> +for (; new ? i>  i = envc;
>>}
>>break;
>> diff --git a/tests/env.test b/tests/env.test
>> index 3098731d..4df118dd 100755
>> --- a/tests/env.test
>> +++ b/tests/env.test
>> @@ -19,3 +19,5 @@ unset WALRUS BANANA LETTERS FILTER
>>
>>  testcmd "early fail" '--oops 2> /dev/null ; echo $?' "125\n" "" ""
>>  testcmd "why is this allowed" "=BLAH env | grep '^=BLAH\$'" "=BLAH\n" "" ""
>> +
>> +testcmd "replace" "A=foo PATH= `which printenv` A" "foo\n" "" ""
>> --
>> 2.21.0.1020.gf2820cf01a-goog
> ___
> Toybox mailing list
> Toybox@lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net
> .
> 
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] man: support MANPATH.

2019-05-03 Thread Rob Landley
On 4/29/19 7:36 PM, makepost wrote:
> On Mon, Apr 29, 2019 at 04:33:07PM -0700, enh via Toybox wrote:
> +  if (!TT.M) TT.M = getenv("MANPATH");
> 
> Breaks unfortunately, distros e.g. gentoo are prefixing the main path in
> the variable with colon divided binutils and gcc dirs, apparently for
> allowing to install multiple versions and architectures on same system.
> I can rework tryfile and FLAG(k) from yesterday's patch to walk every
> directory from this variable, would that be viable?

Sorry, I still haven't moved off gmail and this got caught in gmail's out of
control spam filter again.

I've got some colon separated path walking code (lib.c find_in_path()), I should
probably try to genericize it somehow to cover this case better. Hmmm...
(Possibly teach _it_ to wordexp()? Or a more generic callback...)

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Android moved the scheduler policy functions in Q.

2019-05-03 Thread enh via Toybox
On Fri, May 3, 2019 at 11:41 AM Rob Landley  wrote:
>
> On 5/2/19 12:37 PM, enh wrote:
> > On Thu, May 2, 2019 at 9:35 AM Rob Landley  wrote:
> >>
> >> On 5/1/19 5:40 PM, enh via Toybox wrote:
> >>> They're forwarded to libprocessgroup, but we may as well go straight to
> >>> the source since neither library is in the NDK anyway.
> >>>
> >>> This code is unfortunate because it means that even `toybox true` ends
> >>> up pulling in a JSON parser at runtime, because ps might call
> >>> get_sched_policy/get_sched_policy_name. I'll experiment with
> >>> dlopen-on-demand in portability.c and see whether the savings are
> >>> worthwhile, but for now at least use the current library directly so we
> >>> can save *one* dlopen!
> >>
> >> I'm as-needed in the linker but glibc defeated that by having librt pull in
> >> libpthread when clock_gettime() has been part of glibc proper since the 
> >> 2.17
> >> release ~7 years ago.
> >
> > does as-needed actually help when you're built RELRO? certainly
> > Android does everything up front for RELRO (which has been mandatory
> > for years). i don't know what glibc does, but i do know that my linux
> > boxes are at least using partial RELRO these days.
>
> I have no idea, compilers and optimization are moving targets, and I just
> updated off of Ubuntu April 2014 edition last month. :)
>
> That said, a V=1 build for defconfig on devuan has:
>
>   -Wl,--as-needed -lutil -lcrypt -lm -lresolv -lz
>
> resulting in:
>
> $ ldd toybox
> linux-vdso.so.1
> libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
> /lib64/ld-linux-x86-64.so.2
>
> So it's dropping out zlib, libresolv, libm, and libutil since it didn't
> compile-time resolve anything out of them, so with the default flags at least 
> it
> still works?
>
> Lemme look up what relro does...
>
> https://www.redhat.com/en/blog/hardening-elf-binaries-using-relocation-read-only-relro
>
> Ah. It's the opposite of lazy binding, mark PLT and GOT read only once they've
> been resolved. Yeah, pretty obvious. That should mostly just be a question for
> the dynamic linker, not the compile-time linker, I'd think? I don't see why
> --as-needed at compile time would stop working, since that's _resolved_ at
> compile time. (It says don't generate the ELF references for libraries it 
> didn't
> need any symbols from.)

yeah, i realized that after hitting send yesterday, but forgot to come
back. --as-needed isn't relevant for my case at all.

> >> Meanwhile, Android seem to have reinvented the microkernel again and turned
> >> everything in the world into message passing:
> >>
> >> https://security.googleblog.com/2019/03/open-sourcing-sandboxed-api.html
> >
> > (fwiw, i think that's Chrome. certainly nothing to do with us.)
>
> Oh good. (I'd eventually like to make _that_ support a development environment
> too, but one bite of the elephant at a time...)
>
> Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] env: fix case where a variable is replaced.

2019-05-03 Thread enh via Toybox
(long version: the version of tar in Q is a mess, so even at this late
stage i'm tempted to try to sync with AOSP master, but i can't move
forward because env is broken.)

On Fri, May 3, 2019 at 12:47 PM enh  wrote:
>
> ping?
>
> On Thu, May 2, 2019 at 6:47 PM enh  wrote:
> >
> > Found when trying to update the toybox prebuilt used for the Android
> > build.
> >
> > Also add the corresponding test.
> > ---
> >  lib/env.c  | 4 +---
> >  tests/env.test | 2 ++
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/env.c b/lib/env.c
> > index 35ef688c..bc23b753 100644
> > --- a/lib/env.c
> > +++ b/lib/env.c
> > @@ -58,11 +58,9 @@ void xsetenv(char *name, char *val)
> >  if (!memcmp(name, environ[i], len) && environ[i][len]=='=') {
> >if (i>=envc) free(environ[i]);
> >else {
> > -char **delete = environ+i;
> > -
> >  // move old entries down, add at end of old data
> >  toys.envc = envc--;
> > -for (i=0; new ? i > +for (; new ? i >  i = envc;
> >}
> >break;
> > diff --git a/tests/env.test b/tests/env.test
> > index 3098731d..4df118dd 100755
> > --- a/tests/env.test
> > +++ b/tests/env.test
> > @@ -19,3 +19,5 @@ unset WALRUS BANANA LETTERS FILTER
> >
> >  testcmd "early fail" '--oops 2> /dev/null ; echo $?' "125\n" "" ""
> >  testcmd "why is this allowed" "=BLAH env | grep '^=BLAH\$'" "=BLAH\n" "" ""
> > +
> > +testcmd "replace" "A=foo PATH= `which printenv` A" "foo\n" "" ""
> > --
> > 2.21.0.1020.gf2820cf01a-goog
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] env: fix case where a variable is replaced.

2019-05-03 Thread enh via Toybox
ping?

On Thu, May 2, 2019 at 6:47 PM enh  wrote:
>
> Found when trying to update the toybox prebuilt used for the Android
> build.
>
> Also add the corresponding test.
> ---
>  lib/env.c  | 4 +---
>  tests/env.test | 2 ++
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/env.c b/lib/env.c
> index 35ef688c..bc23b753 100644
> --- a/lib/env.c
> +++ b/lib/env.c
> @@ -58,11 +58,9 @@ void xsetenv(char *name, char *val)
>  if (!memcmp(name, environ[i], len) && environ[i][len]=='=') {
>if (i>=envc) free(environ[i]);
>else {
> -char **delete = environ+i;
> -
>  // move old entries down, add at end of old data
>  toys.envc = envc--;
> -for (i=0; new ? i +for (; new ? i  i = envc;
>}
>break;
> diff --git a/tests/env.test b/tests/env.test
> index 3098731d..4df118dd 100755
> --- a/tests/env.test
> +++ b/tests/env.test
> @@ -19,3 +19,5 @@ unset WALRUS BANANA LETTERS FILTER
>
>  testcmd "early fail" '--oops 2> /dev/null ; echo $?' "125\n" "" ""
>  testcmd "why is this allowed" "=BLAH env | grep '^=BLAH\$'" "=BLAH\n" "" ""
> +
> +testcmd "replace" "A=foo PATH= `which printenv` A" "foo\n" "" ""
> --
> 2.21.0.1020.gf2820cf01a-goog
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] FYI: kernel builds rely on GNU expr extensions

2019-05-03 Thread enh via Toybox
On Fri, May 3, 2019 at 11:28 AM Rob Landley  wrote:
>
> On 5/2/19 4:20 PM, enh wrote:
> > (fwiw, with an instrumented build of the 4.19 Android common kernel, i
> > only see `expr 4 * 65536 + 019 * 256 + 037`. so i don't know what
> > they're _really_ doing to hit this.)
>
> Back under Aboriginal Linux what I'd do is prepare two parallel package build
> folders, one with the new environment and one with the old environment, then 
> run
> a UP build with "2>&1 | tee out.txt" (no parallelism so no log jitter and 
> where
> it stops is the first time it noticed the problem), then rename that directory
> and name the second directory in its place (so both builds have the same
> absolute path to reduce log diff) and build the other one, and then diff both
> entire trees and swear about all the false positives caused by the clock
> changing and that making its way into binary output and/or logs. Oh, and V=1 
> in
> kernel builds, obviously...
>
> This sort of thing was often a multi-day dig. I need to build up mkroot so 
> it's
> building as many things as aboriginal linux used to, but have yet to circle my
> way back around to that...
>
> Where's the git repo and .config for this kernel they're trying to build when
> they hit this?

no idea. i'm playing telephone --- it's not true for the Android common kernel.

i'll let you know if i find out what's really going on.

> Rob
>
> P.S. Yeah, I should start the Giant AOSP download onto the new machine when I
> get back to the cable modem after lunch. 16 gigs ram and 2 terabyte drive,
> should _finally_ have space for it. Only an i5-3340M processor though, It's 2X
> SMP @2.7 ghz hyperthreaded to 4X, probably a 24 hour build time for the whole
> image...
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread enh via Toybox
On Fri, May 3, 2019 at 11:59 AM Rob Landley  wrote:
>
> On 5/3/19 1:56 PM, Rob Landley wrote:
> > On 5/3/19 1:05 PM, enh wrote:
> > But yeah, the new pessimal case after the change I'm making now would be a
> > megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would need the one output buffer
> > thing...
>
> And it can be avoided by an in-place copy that remembers "last gap" and 
> doesn't
> copy trailing data until the _next_ hit (or we confirm there's no hit) so we
> know how much of the skipped data is "confirmed" and only copy it once we know
> we're keeping it.

yeah, that's what i meant by only ensuring that the first n characters
are right.

> So every kept byte is only copied once, and it can still be done in place for
> the shrinking case. (You'll have to realloc when you expand, but that's 
> probably
> unavoidable...)
>
> Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/3/19 1:05 PM, enh wrote:
> BSD seems to avoid all the copying? i think they have a "source" and
> "destination" and only write to the latter (solving your issue), but
> only move forwards (so i don't think they try to maintain it as "what
> the whole result would look like if we only did this many
> replacements", rather "here's what the first n characters of the
> result will look like").

Oh I know it's solvable. Working on it.

The s/x/y/g case A) isn't changing the length of the string, B) has no backrefs.
So it doesn't need the copying, yes.

Mine only moves forwards too: line and len are the start of the line and the
whole line's length, but rline and rlen are the remaining line and remaining
length. I just responded to the "don't overwrite backref data while still using
it" problem by copying the whole line, when I didn't need to and it opens this
pathological case. Hence last email on what I _should_ do instead.

(Hmmm... the "one input buffer one output buffer" thing optimizes for the
"replace with shorter string" case where you avoid repeated memmove() on the
tail of the string. But it slightly pessimizes the cases where you _don't_ and
realloc() a lot. It's also a bigger change to the code, lemme see how much this
does first...)

But yeah, the new pessimal case after the change I'm making now would be a
megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would need the one output buffer
thing...

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/3/19 1:56 PM, Rob Landley wrote:
> On 5/3/19 1:05 PM, enh wrote:
> But yeah, the new pessimal case after the change I'm making now would be a
> megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would need the one output buffer
> thing...

And it can be avoided by an in-place copy that remembers "last gap" and doesn't
copy trailing data until the _next_ hit (or we confirm there's no hit) so we
know how much of the skipped data is "confirmed" and only copy it once we know
we're keeping it.

So every kept byte is only copied once, and it can still be done in place for
the shrinking case. (You'll have to realloc when you expand, but that's probably
unavoidable...)

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Android moved the scheduler policy functions in Q.

2019-05-03 Thread Rob Landley
On 5/2/19 12:37 PM, enh wrote:
> On Thu, May 2, 2019 at 9:35 AM Rob Landley  wrote:
>>
>> On 5/1/19 5:40 PM, enh via Toybox wrote:
>>> They're forwarded to libprocessgroup, but we may as well go straight to
>>> the source since neither library is in the NDK anyway.
>>>
>>> This code is unfortunate because it means that even `toybox true` ends
>>> up pulling in a JSON parser at runtime, because ps might call
>>> get_sched_policy/get_sched_policy_name. I'll experiment with
>>> dlopen-on-demand in portability.c and see whether the savings are
>>> worthwhile, but for now at least use the current library directly so we
>>> can save *one* dlopen!
>>
>> I'm as-needed in the linker but glibc defeated that by having librt pull in
>> libpthread when clock_gettime() has been part of glibc proper since the 2.17
>> release ~7 years ago.
> 
> does as-needed actually help when you're built RELRO? certainly
> Android does everything up front for RELRO (which has been mandatory
> for years). i don't know what glibc does, but i do know that my linux
> boxes are at least using partial RELRO these days.

I have no idea, compilers and optimization are moving targets, and I just
updated off of Ubuntu April 2014 edition last month. :)

That said, a V=1 build for defconfig on devuan has:

  -Wl,--as-needed -lutil -lcrypt -lm -lresolv -lz

resulting in:

$ ldd toybox
linux-vdso.so.1
libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
/lib64/ld-linux-x86-64.so.2

So it's dropping out zlib, libresolv, libm, and libutil since it didn't
compile-time resolve anything out of them, so with the default flags at least it
still works?

Lemme look up what relro does...

https://www.redhat.com/en/blog/hardening-elf-binaries-using-relocation-read-only-relro

Ah. It's the opposite of lazy binding, mark PLT and GOT read only once they've
been resolved. Yeah, pretty obvious. That should mostly just be a question for
the dynamic linker, not the compile-time linker, I'd think? I don't see why
--as-needed at compile time would stop working, since that's _resolved_ at
compile time. (It says don't generate the ELF references for libraries it didn't
need any symbols from.)

>> Meanwhile, Android seem to have reinvented the microkernel again and turned
>> everything in the world into message passing:
>>
>> https://security.googleblog.com/2019/03/open-sourcing-sandboxed-api.html
> 
> (fwiw, i think that's Chrome. certainly nothing to do with us.)

Oh good. (I'd eventually like to make _that_ support a development environment
too, but one bite of the elephant at a time...)

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] FYI: kernel builds rely on GNU expr extensions

2019-05-03 Thread Rob Landley
On 5/2/19 4:20 PM, enh wrote:
> (fwiw, with an instrumented build of the 4.19 Android common kernel, i
> only see `expr 4 * 65536 + 019 * 256 + 037`. so i don't know what
> they're _really_ doing to hit this.)

Back under Aboriginal Linux what I'd do is prepare two parallel package build
folders, one with the new environment and one with the old environment, then run
a UP build with "2>&1 | tee out.txt" (no parallelism so no log jitter and where
it stops is the first time it noticed the problem), then rename that directory
and name the second directory in its place (so both builds have the same
absolute path to reduce log diff) and build the other one, and then diff both
entire trees and swear about all the false positives caused by the clock
changing and that making its way into binary output and/or logs. Oh, and V=1 in
kernel builds, obviously...

This sort of thing was often a multi-day dig. I need to build up mkroot so it's
building as many things as aboriginal linux used to, but have yet to circle my
way back around to that...

Where's the git repo and .config for this kernel they're trying to build when
they hit this?

Rob

P.S. Yeah, I should start the Giant AOSP download onto the new machine when I
get back to the cable modem after lunch. 16 gigs ram and 2 terabyte drive,
should _finally_ have space for it. Only an i5-3340M processor though, It's 2X
SMP @2.7 ghz hyperthreaded to 4X, probably a 24 hour build time for the whole
image...
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread enh via Toybox
BSD seems to avoid all the copying? i think they have a "source" and
"destination" and only write to the latter (solving your issue), but
only move forwards (so i don't think they try to maintain it as "what
the whole result would look like if we only did this many
replacements", rather "here's what the first n characters of the
result will look like").

On Fri, May 3, 2019 at 11:02 AM Rob Landley  wrote:
>
> On 5/3/19 12:40 PM, Rob Landley wrote:
> > On 5/2/19 9:46 PM, enh via Toybox wrote:
> >> i've known about this for a couple of days and haven't had time to
> >> look at it properly yet, so i should mention it here...
> >>
> >> if you have a file with a 1MiB line of 'x'es and you sed 's/x/y/g',
> >> BSD or GNU sed finishes immediately, but toybox takes forever.
> >
> > Because it allocates a replacement string and copies the 
> > before/replace/after
> > parts into it for each replacement. I should switch the xmalloc() on line 
> > 515 to
> > xrealloc() and fiddle with the surrounding code not to do unnecessary 
> > work...
>
> And luckily I left myself a comment about why I _didn't_ do that:
>
>   // Allocate new size, copy start/end around match. (Can't extend in
>   // place because backrefs may refer to text after it's overwritten.)
>
> I can realloc() if there were no backrefs, or I can make a temp copy of _just_
> the removed text (since all the backrefs have to be within match[0])...
>
> Hmmm... it's easy to record whether there were backrefs since the code right
> above there measures the new length including backrefs, so I can have it set a
> flag. The embedded newline support does NOT include matching a NUL (because 
> that
> would involve rewriting the regex library) so there can't be a NUL _within_ 
> the
> match so I can just conditionally xstrndup()...
>
> It's a bit fiddly, but not too bad so far...
>
> Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/3/19 12:40 PM, Rob Landley wrote:
> On 5/2/19 9:46 PM, enh via Toybox wrote:
>> i've known about this for a couple of days and haven't had time to
>> look at it properly yet, so i should mention it here...
>>
>> if you have a file with a 1MiB line of 'x'es and you sed 's/x/y/g',
>> BSD or GNU sed finishes immediately, but toybox takes forever.
> 
> Because it allocates a replacement string and copies the before/replace/after
> parts into it for each replacement. I should switch the xmalloc() on line 515 
> to
> xrealloc() and fiddle with the surrounding code not to do unnecessary work...

And luckily I left myself a comment about why I _didn't_ do that:

  // Allocate new size, copy start/end around match. (Can't extend in
  // place because backrefs may refer to text after it's overwritten.)

I can realloc() if there were no backrefs, or I can make a temp copy of _just_
the removed text (since all the backrefs have to be within match[0])...

Hmmm... it's easy to record whether there were backrefs since the code right
above there measures the new length including backrefs, so I can have it set a
flag. The embedded newline support does NOT include matching a NUL (because that
would involve rewriting the regex library) so there can't be a NUL _within_ the
match so I can just conditionally xstrndup()...

It's a bit fiddly, but not too bad so far...

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/2/19 9:46 PM, enh via Toybox wrote:
> i've known about this for a couple of days and haven't had time to
> look at it properly yet, so i should mention it here...
> 
> if you have a file with a 1MiB line of 'x'es and you sed 's/x/y/g',
> BSD or GNU sed finishes immediately, but toybox takes forever.

Because it allocates a replacement string and copies the before/replace/after
parts into it for each replacement. I should switch the xmalloc() on line 515 to
xrealloc() and fiddle with the surrounding code not to do unnecessary work...

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net