On 4/24/19 5:50 PM, enh wrote: > On Wed, Apr 24, 2019 at 3:40 PM Rob Landley <[email protected]> wrote: >> >> On 4/23/19 8:06 PM, enh wrote: >>> (sorry. busy.) >> >> Sigh. I'll add a config symbol you can switch on, with help text complaining >> about it. > > well, it's not necessarily grep i'm worried about. i think if all the > other toys are now going to be line-buffered, it doesn't make much > sense to single out grep like this.
Neither do I. That said, after writing up a point by point rebuttal to your message, I benchmarked the code (silly to argue about something we haven't tested the speed of) and found out that devuan's grep takes 8 seconds to do something mine takes 20 to do _before_ the output buffering changed. Which is unacceptable and made the argument we were having kinda moot. So now I'm staring at do_lines() and seeing if I can get it to do large (at least 64k) block reads where it doesn't memcpy() the data out into individual lines but passes a pointer into the buffer where possible. (And this lets me replace get_line(fd) which has been on the todo list forever, although moving the all to do_lines() and a callback is probably the best way...) I might even be able to get it to mmap() and mremap() the data traversing down a file without too intrusive a second path. (Does madvise(MADV_WILLNEED) persist across mremaps of the same mapping, or do I need to call it again?) I _also_ have a todo item WAY down my list that if toysh has a "find | grep" pipeline and both are toybox commands it can automatically -print0 -0 on it, but I'm not sure the proper semantics for triggering that. (Or saying it _shouldn't_ do that, which is why it's way down on the list...) >> Does the attached work/suffice? (I _think_ setlinebuf(stdout) followed by >> setvbuf() something else is legal as long as we haven't output anything to >> the >> stream yet? But the man page is unclear. Presumably the admonition is because >> pending unflushed data in the buffer would get discarded, but they don't >> _say_...) > > (POSIX explicitly says that it's only valid before any output, but > bionic [and BSDs] flush output data and discard input data if you call > setvbuf [and the others are implemented in terms of setvbuf].) musl doesn't appear bothered either. I read that as saying we can do it twice before any output, but would be amazed if glibc ever tested that, yet care less and less about breaking anything with "gnu" on it each year. It comes pre-broken by the gnu all over it. Rob _______________________________________________ Toybox mailing list [email protected] http://lists.landley.net/listinfo.cgi/toybox-landley.net
