Re: grep: convert fgetln to getline

Lauri Tirkkonen Thu, 24 Jan 2019 03:58:41 -0800

On Thu, Jan 24 2019 04:40:08 -0700, Theo de Raadt wrote:
> >On Thu, Jan 24 2019 04:22:20 -0700, Theo de Raadt wrote:
> >> I would like to know if this does more malloc.  I worry it is an additional
> >> level of malloc per line.
> >
> >It does do more malloc than plain fgetln since fgetln does no copying,
> >but nowhere near every line. The same buffer lnbuf is used for each
> >line, and libc getline() reallocates it if it is not large enough.
> 
> If there is more allocation, it will be more expensive.  Our malloc
> has many checks and also fresh allocations trigger expensive code paths
> in mmap intentionally -- this makes our runtime somewhat slower, but finds
> and fixes bugs in the same software running on other operating systems
> who lack our attitude towards rigor over performance).
> 
> question is, are you fixing a stylistic nit, or making it fundamentally
> better.


I do think I am making it fundamentally better, since fgetln is a broken
interface (it returns lines that are not NUL-terminated).

The actual reason I started doing this was porting grep to an OS without
fgetln(). But I don't think it's a bad thing to reduce fgetln usage in
OpenBSD either.

> A bunch of us run a whole ton of grep during our work...
> 
> that's my basic question.

Sure, I understand this and I do agree grep performance matters. I just
want to reiterate that this diff only affects performance if the mmap
code path is not taken (i.e. the stream is not seekable (determined
with isatty), or grep was built with -DSMALL (bsd.rd grep, I think), or
mmap fails). And even then I think the cost is negligible: getline grows
the buffer in powers of 2, so only a line that does not fit in the
current buffer (roughly, one twice as long as any line seen so far)
pays the price.
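
To make the allocation argument concrete, here is a minimal sketch (not
the code from the diff) that reads a stream with getline() into one
reused buffer and counts how often the buffer capacity changes. The
function name count_buffer_growth and the grows counter are made up for
illustration; lnbuf is the buffer name mentioned above. A capacity
change is only a proxy for an allocator call, and the exact growth
policy is up to the libc in use.

```c
#define _POSIX_C_SOURCE 200809L	/* for getline() and ssize_t */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: read every line from fp with getline(), reusing
 * a single buffer.  Returns the number of lines read and stores in
 * *grows how many times the buffer capacity changed.
 */
static size_t
count_buffer_growth(FILE *fp, size_t *grows)
{
	char *lnbuf = NULL;	/* getline() allocates this on first use */
	size_t lnbufsize = 0;	/* capacity, updated by getline() */
	size_t prevsize = 0, nlines = 0;
	ssize_t len;

	assert(fp != NULL && grows != NULL);

	*grows = 0;
	while ((len = getline(&lnbuf, &lnbufsize, fp)) != -1) {
		/* lnbuf holds len bytes followed by a terminating NUL */
		nlines++;
		if (lnbufsize != prevsize) {
			/* capacity changed, so getline() (re)allocated */
			(*grows)++;
			prevsize = lnbufsize;
		}
	}
	free(lnbuf);
	return nlines;
}
```

Calling it on a file of many short lines should report far fewer
capacity changes than lines, since most calls reuse the buffer that is
already there.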

-- 
Lauri Tirkkonen | lotheac @ IRCnet
