I don't have access to a newer gawk on the machine where I did the initial
timings, but I ran an almost identical test on my home machine:

    grep (v3.11):                                              ~0.60s
    perl (v5.38.0):                                            ~3.21s
    gawk (v4.0.2 built from source with `-O3 -march=native`): ~10.22s
    gawk (v5.2.2 built from source with `-O3 -march=native`):  ~4.95s

If grep will never add this functionality I'll survive. It just seemed like
it might not be too much work to implement, and would probably still be
much faster than using awk/perl. I've never looked at the grep source code
before, but could be tempted to try implementing it myself if there was any
chance of the patch being accepted.
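
In the meantime, a stopgap that keeps grep doing the actual searching could
look roughly like the sketch below. `hgrep` is a made-up name, not an
existing command or grep option, and its speed is not measured here. It
always prints the first line of each file (the "always" variant of the
request), then greps the remaining lines, so a header that itself matches is
printed only once:

```shell
# hgrep PATTERN FILE...
# Hypothetical helper: print each file's first line, then any lines
# after it that match PATTERN.
hgrep() {
    pattern=$1
    shift
    for f in "$@"; do
        head -n 1 "$f"                          # header line, unconditionally
        tail -n +2 "$f" | grep -e "$pattern"    # search everything after it
    done
}
```

Usage would be something like `hgrep pattern file1.csv file2.csv`. Unlike
`grep` with several files, this sketch does not prefix output lines with the
file name.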

Dan

On Mon, Aug 21, 2023 at 2:37 PM <arn...@skeeve.com> wrote:

> Gawk 4.0.2 is 11 years old. Try timing the current version,
> I'll bet it's faster.  And it solves your problem NOW,
> instead of waiting for a feature that the grep developers
> aren't likely to add.
>
> My two cents of course.
>
> Arnold
>
> Daniel Green <ddgr...@gmail.com> wrote:
>
> > That works, as well as the Perl version I've been using:
> >
> >     perl -ne 'print if ($. == 1 || /pattern/)'
> >
> > But timings for a real-life example (3GB file with ~16m lines, CentOS 7)
> > show the problem:
> >
> >     grep (v2.20):    ~1.15s
> >     perl (v5.36.1):  ~4.48s
> >      awk (v4.0.2):  ~10.81s
> >
> > Admittedly grep is just searching in those timings, but I suspect it could
> > accomplish the full task with a minimal decrease in speed.
> >
> > Dan
> >
> > On Mon, Aug 21, 2023 at 12:57 PM <arn...@skeeve.com> wrote:
> >
> > > Daniel Green <ddgr...@gmail.com> wrote:
> > >
> > > > I'm frequently searching CSV files with 20-30 columns, and when there's a
> > > > hit it can be hard to know what the columns are. An option to also print
> > > > the first line of a file (either always, or only if that file had a match
> > > > to the pattern) in addition to any hits would be nice.
> > > >
> > > > Thanks,
> > > > Dan
> > >
> > > It sounds like awk would be a better tool:
> > >
> > >         awk 'FNR == 1 || /pattern/' files ...
> > >
> > > should do the trick.
> > >
> > > HTH,
> > >
> > > Arnold
> > >
>
