Re: [Toybox] pathological case in sed s///g

2019-05-08 Thread Rob Landley
I emailed Michael Kerrisk to update the man page to mention it... Rob On 5/8/19 7:52 PM, enh wrote: > (i checked and macOS does have both REG_STARTEND and REG_PEND.) > > From: Rob Landley > Date: Mon, May 6, 2019 at 10:42 AM > To: enh > Cc: toybox, Rich Felker > >> Huh... I'll assume

Re: [Toybox] pathological case in sed s///g

2019-05-08 Thread enh via Toybox
(i checked and macOS does have both REG_STARTEND and REG_PEND.) From: Rob Landley Date: Mon, May 6, 2019 at 10:42 AM To: enh Cc: toybox, Rich Felker > Huh... I'll assume REG_STARTEND works in bionic since you're pointing me at > it. > It's not in the regex man page but it _is_ in the glibc

Re: [Toybox] pathological case in sed s///g

2019-05-07 Thread Rich Felker
On Mon, May 06, 2019 at 10:57:23PM -0700, enh wrote: > > I know not everything is a macro that can be tested; both myself and > > folks from other implementations are interested in developing an > > agreed-upon way to report availability of other extensions via macros, > > so that configure-style

Re: [Toybox] pathological case in sed s///g

2019-05-07 Thread Rich Felker
On Tue, May 07, 2019 at 05:27:24PM +0300, makep...@firemail.cc wrote: > > > wouldn't it be better to have more stuff along the lines of clang's > > __has_builtin/__has_feature/__has_include? they're already very useful > > for getting rid of this kind of stuff, and a __has_function would > >

Re: [Toybox] pathological case in sed s///g

2019-05-07 Thread makepost
> wouldn't it be better to have more stuff along the lines of clang's > __has_builtin/__has_feature/__has_include? they're already very useful > for getting rid of this kind of stuff, and a __has_function would > cover most of what's missing. Feature claims of clang and musl are giving us a

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread enh via Toybox
From: Rich Felker Date: Mon, May 6, 2019 at 7:40 PM To: Rob Landley Cc: toybox > On Mon, May 06, 2019 at 07:05:46PM -0500, Rob Landley wrote: > > On 5/6/19 12:48 PM, Rich Felker wrote: > > > On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote: > > >> Huh... I'll assume REG_STARTEND works

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread Rich Felker
On Mon, May 06, 2019 at 07:05:46PM -0500, Rob Landley wrote: > On 5/6/19 12:48 PM, Rich Felker wrote: > > On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote: > >> Huh... I'll assume REG_STARTEND works in bionic since you're pointing me > >> at it. > >> It's not in the regex man page but

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread Rob Landley
On 5/6/19 12:48 PM, Rich Felker wrote: > On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote: >> Huh... I'll assume REG_STARTEND works in bionic since you're pointing me at >> it. >> It's not in the regex man page but it _is_ in the glibc headers... >> >>

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread Rich Felker
On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote: > Huh... I'll assume REG_STARTEND works in bionic since you're pointing me at > it. > It's not in the regex man page but it _is_ in the glibc headers... > > https://github.com/bminor/glibc/commit/6fefb4e0b16 > > Looks like it went

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread Rob Landley
Huh... I'll assume REG_STARTEND works in bionic since you're pointing me at it. It's not in the regex man page but it _is_ in the glibc headers... https://github.com/bminor/glibc/commit/6fefb4e0b16 Looks like it went into glibc in 2004, which is way past 7 years. I should poke Michael Kerrisk

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread enh via Toybox
i think you might be able to use REG_STARTEND to avoid the implicit strlen: https://www.freebsd.org/cgi/man.cgi?query=regex&sektion=3&manpath=freebsd-release-ports REG_STARTEND The string is considered to start at string + pmatch[0].rm_so and to end before the byte located at string + pmatch[0].rm_eo,
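
To illustrate the REG_STARTEND suggestion above, here is a minimal sketch in C. It is not code from toybox or from any libc; `match_range` is a hypothetical helper name. It assumes a libc that defines REG_STARTEND as a macro (the thread mentions glibc, bionic, FreeBSD and macOS; it is a non-POSIX extension, so the sketch refuses to build without it). The caller passes the start and end offsets in pmatch[0], so regexec() does not need to find the end of the string itself.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#ifndef REG_STARTEND
#error "this sketch assumes a libc that provides the REG_STARTEND extension"
#endif

/* Hypothetical helper: search buf[start..end) for a match of `re`.
 * Returns 0 on a match (like regexec) and fills m[0] with offsets that are
 * relative to buf, not to buf + start. The range is passed in through m[0],
 * which is how REG_STARTEND takes its input. */
int match_range(const regex_t *re, const char *buf, size_t start, size_t end,
                regmatch_t *m, int eflags)
{
    assert(start <= end);
    m[0].rm_so = (regoff_t)start;
    m[0].rm_eo = (regoff_t)end;
    return regexec(re, buf, 1, m, eflags | REG_STARTEND);
}
```

A caller looping over one long line would call this repeatedly with start set to the end of the previous match. How `^` behaves when start is nonzero may differ between implementations, so passing REG_NOTBOL for every call after the first is the cautious choice.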

Re: [Toybox] pathological case in sed s///g

2019-05-05 Thread Rob Landley
On 5/3/19 2:42 PM, enh wrote: > On Fri, May 3, 2019 at 11:59 AM Rob Landley wrote: >> >> On 5/3/19 1:56 PM, Rob Landley wrote: >>> On 5/3/19 1:05 PM, enh wrote: >>> But yeah, the new pessimal case after the change I'm making now would be a >>> megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would

Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread enh via Toybox
On Fri, May 3, 2019 at 11:59 AM Rob Landley wrote: > > On 5/3/19 1:56 PM, Rob Landley wrote: > > On 5/3/19 1:05 PM, enh wrote: > > But yeah, the new pessimal case after the change I'm making now would be a > > megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would need the one output buffer > >

Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/3/19 1:05 PM, enh wrote: > BSD seems to avoid all the copying? i think they have a "source" and > "destination" and only write to the latter (solving your issue), but > only move forwards (so i don't think they try to maintain it as "what > the whole result would look like if we only did this

Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/3/19 1:56 PM, Rob Landley wrote: > On 5/3/19 1:05 PM, enh wrote: > But yeah, the new pessimal case after the change I'm making now would be a > megabyte of xyxyxyxy with 's/xy/x/g' _THAT_ would need the one output buffer > thing... And it can be avoided by an in-place copy that remembers

Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread enh via Toybox
BSD seems to avoid all the copying? i think they have a "source" and "destination" and only write to the latter (solving your issue), but only move forwards (so i don't think they try to maintain it as "what the whole result would look like if we only did this many replacements", rather "here's
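
The "source" and "destination" idea described above can be sketched roughly as follows. This is an illustration only, not BSD sed's code and not the toybox fix: `subst_all` and `reserve` are hypothetical names, the replacement is treated as a literal string (no `&` or `\1` handling), and REG_STARTEND is assumed to be available. The source line is only read, the destination is only appended to, and the read position only moves forward, so each byte is copied once.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#ifndef REG_STARTEND
#error "this sketch assumes a libc that provides the REG_STARTEND extension"
#endif

/* Grow *dst (doubling) until it can hold `need` bytes. Returns 0 on success. */
static int reserve(char **dst, size_t *cap, size_t need)
{
    size_t ncap = *cap;
    char *p;

    if (need <= ncap) return 0;
    while (ncap < need) ncap *= 2;
    if (!(p = realloc(*dst, ncap))) return -1;
    *dst = p;
    *cap = ncap;
    return 0;
}

/* Hypothetical helper: replace every match of `re` in src[0..len) with the
 * literal string `rep`. Returns a malloc'd NUL-terminated result, or NULL if
 * allocation fails. Empty matches at the very end of input are not replaced
 * (a simplification a real sed cannot make). */
char *subst_all(const regex_t *re, const char *src, size_t len, const char *rep)
{
    size_t replen = strlen(rep), cap = len + 64, out = 0, pos = 0, tail;
    char *dst = malloc(cap);
    regmatch_t m[1];

    if (!dst) return NULL;
    while (pos < len) {
        size_t so, eo;

        m[0].rm_so = (regoff_t)pos;   /* search window is src[pos..len) */
        m[0].rm_eo = (regoff_t)len;
        if (regexec(re, src, 1, m, REG_STARTEND | (pos ? REG_NOTBOL : 0)))
            break;                    /* no further match */
        so = (size_t)m[0].rm_so;
        eo = (size_t)m[0].rm_eo;
        assert(pos <= so && so <= eo && eo <= len);

        /* room for unmatched prefix + replacement + one extra byte */
        if (reserve(&dst, &cap, out + (so - pos) + replen + 1)) {
            free(dst);
            return NULL;
        }
        memcpy(dst + out, src + pos, so - pos);   /* text before the match */
        out += so - pos;
        memcpy(dst + out, rep, replen);           /* the replacement */
        out += replen;
        if (eo == so) {               /* empty match: copy one byte to advance */
            dst[out++] = src[so];
            pos = so + 1;
        } else pos = eo;
    }

    tail = pos < len ? len - pos : 0; /* unmatched remainder of the line */
    if (reserve(&dst, &cap, out + tail + 1)) {
        free(dst);
        return NULL;
    }
    memcpy(dst + out, src + pos, tail);
    dst[out + tail] = 0;
    return dst;
}
```

With a regex compiled from "xy", calling subst_all(&re, "xyxyxyxy", 8, "x") yields "xxxx", which is the xyxyxyxy case mentioned elsewhere in the thread at a much smaller scale.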

Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/3/19 12:40 PM, Rob Landley wrote: > On 5/2/19 9:46 PM, enh via Toybox wrote: >> i've known about this for a couple of days and haven't had time to >> look at it properly yet, so i should mention it here... >> >> if you have a file with a 1MiB line of 'x'es and you sed 's/x/y/g', >> BSD or GNU

Re: [Toybox] pathological case in sed s///g

2019-05-03 Thread Rob Landley
On 5/2/19 9:46 PM, enh via Toybox wrote: > i've known about this for a couple of days and haven't had time to > look at it properly yet, so i should mention it here... > > if you have a file with a 1MiB line of 'x'es and you sed 's/x/y/g', > BSD or GNU sed finishes immediately, but toybox takes

[Toybox] pathological case in sed s///g

2019-05-02 Thread enh via Toybox
i've known about this for a couple of days and haven't had time to look at it properly yet, so i should mention it here... if you have a file with a 1MiB line of 'x'es and you sed 's/x/y/g', BSD or GNU sed finishes immediately, but toybox takes forever.