vi: count occurrences of a substring

ropers Sat, 04 Sep 2021 13:15:40 -0700

On 04/09/2021, Marc Chantreux <[email protected]> wrote:
> hello,
>
>>   :!sed s/abc/abc\n/g % | grep -c abc
>
> Note: in sed, "what i just matched" is noted &


Oh, that's good, thank you.

  *Shoulda seenit on the man page -- butta dinnt.*

From sed(1):
> An ampersand (‘&’) appearing in the replacement is replaced by the string 
> matching the regular expression.

>> Googled information suggests that the opposite of what's described in
>> the man page may be true:  You CAN use a literal newline, but you
>> can't use \n.
>
> BSD sed is more literal AFAIK so you need to escape a real 0x0A but
> both GNU and BSD support escaped newlines:
>
>         sed 's/abc/&\
>         /g'

So is this incorrect? <https://man.openbsd.org/sed#SED_REGULAR_EXPRESSIONS>:

> The escape sequence \n matches a newline character embedded in the pattern 
> space. You can't, however, use a literal newline character in an address or 
> in the substitute command.
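
For what it's worth, here is a quick check of the escaped-newline form,
as a sketch only.  The sample input is made up, and I'm assuming a
POSIX-ish sed and grep.  The count relies on grep -c counting lines, so
every match has to end up on its own line.

```shell
# Put a newline after every match by escaping a real newline in the
# replacement, then count the lines that contain the match.
# The input string is a made-up sample containing three matches.
n=$(printf 'abcabc xabcx\n' | sed 's/abc/&\
/g' | grep -c abc)
echo "$n"
```

The second line of the sed script starts in column 0 on purpose: any
leading whitespace there would become part of the replacement text.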

> This doesn't help in vi so you can fake it for a moment using tr:
>
>         sed 's/abc/&œ/g' | tr œ '\n'

Like I mentioned elsewhere, I'm *really* not a fan of the "let's
assume the input won't contain THIS" approach, and like I also
mentioned in another mail, I would prolly write a script absent
superior alternatives, so yeah, I agree:

> Another solution is to write commands for this kind of tasks:
>
> <<\. cat > ~/x
> #! /bin/ksh
>
> sed -r 's/a/&\
> /g'
> .

Wait, hold up, I'm not familiar with this input redirection idiom.
Could you explain?  Why the double <, and why does it not work with a single <?
Also, could you explain the escaped period?[0]  This is very hard to google.

> then from vi, :w !~/x

While we're at this, we should probably try and complete the script
(which also needs the chmod +x treatment), so this will make more
sense to all.
Like so:

  #!/bin/sh
  # Count non-overlapping matches of the regex $1 on stdin ($1 must not contain "/").
  sed -E 's/'"$1"'/&\
  /g' | grep -cE -e "$1"

Then, from the shell:

  $ <FILE count abc

Or from inside (n)vi:
  (I named my script count and put it in ~/bin/, which is in my PATH.)

  :!<% count abc

Or ":w !<% count abc", which is arguably better and just a tiny bit longer.
But there's no shame in ":!cat % | count abc" or ":!cat FILE | count abc"
either.  The best invocation is the one you remember.  *Now KISS.*

An idea might be to use $@ instead of $1 in the script, but I haven't
really thought through the implications, and I'm not sure how to
reliably quote that after grep -c.  If anyone wants to opine on that,
shoot; but I'll prolly leave this for now.  $1 feels safer and more
KISS-compliant as well.
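
On the quoting part, a minimal sketch, assuming we stay with $1:
double-quoting the parameter in both places should keep a pattern with
spaces in one piece.  The function name count_sub and the sample input
are made up for illustration; it is the same pipeline as the script,
just wrapped in a function so it can be tried in one go.

```shell
# count_sub is a hypothetical name; same pipeline as the script, as a function.
count_sub() {
  # "$1" is double-quoted so a pattern containing spaces stays one word.
  # The pattern must not contain "/", because that is the sed delimiter.
  # grep gets -E so it reads the pattern the same way sed -E does.
  sed -E 's/'"$1"'/&\
/g' | grep -cE -e "$1"
}

# Sample input with two occurrences of "a b" on the first line.
printf 'a b a b\nno match\n' | count_sub 'a b'
```

With $1 left unquoted, the same call would hand sed two separate
arguments and fail, which is one more reason to leave $@ alone for now.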

>> literal carriage return, not a literal newline (^J).  That's the case
>> on Linux as well, and I don't know why.
>
> neither do i.
>
>> Your new subject line is slightly imprecise, as words are usually
>> whitespace-delimited, and I was "looking for a way to count
>> occurrences of
>> 'abc' in FILE".  Not every substring is a word.
>
> right ... wasn't thinking that much to the name. sorry :)

Not to worry, and thank you so much!
Ian

> regards
> marc
>

fn:
[0] as the actress said to the ST ven... never mind
