Re: Changing a text with Grep

Neil Faiman Thu, 17 Oct 2024 11:00:14 -0700

On Oct 17, 2024, at 6:14 AM, mrcmrc <[email protected]> wrote:
> 
> Hi all, I would need help writing a Grep syntax to change a string of text 
> like this:
> 
> House, Big Apple, Today, Movie
> 
> into this:
> 
> [[House]] | [[Big Apple]] | [[Today]] | [[Movie]]


Below is a solution which will do almost exactly what you want. 

Almost exactly, because it will give you 

[[House]]|[[Big Apple]]|[[Today]]|[[Movie]]|

Note the extra vertical bar at the end of the line. The simplest thing is to 
follow this up by removing the trailing vertical bars with
Find: \|$ 
Replace: (nothing)
(The bar has to be escaped, because an unescaped | is the alternation operator.)
A BBEdit text factory makes it simple to automate doing two or more 
find-and-replaces.
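
For anyone who wants to check the cleanup step outside BBEdit, here is a minimal 
sketch in Python. It is only an illustration, not part of the BBEdit solution: 
the sample text and variable names are made up, and Python needs the 
re.MULTILINE flag to make $ match at the end of every line, which BBEdit does 
by default.

```python
import re

# Hypothetical output of the first pass, with the unwanted bar ending each line.
first_pass = "[[House]]|[[Big Apple]]|[[Today]]|[[Movie]]|\n[[A]]|[[B]]|\n"

# The bar is escaped: an unescaped | would mean alternation.
# re.MULTILINE makes $ match at the end of each line, not only at end of text.
cleaned = re.sub(r"\|$", "", first_pass, flags=re.MULTILINE)

print(cleaned, end="")
```

Only the bar that sits directly before a line end is removed; the bars between 
fragments are left alone.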

The problem is that you want to change every “fragment” in a line to 
“[[fragment]]|” except for the last fragment, which you want to change to 
“[[fragment]]”, and there is no way to write a single regular expression 
find-and-replace that chooses among different replacement patterns based on 
the content or context of the matched pattern.

Your example leaves a lot of details unspecified. Here are the assumptions my 
solution makes about exactly what you want:

Divide each text line into fragments. 
Each fragment is a string of text (possibly empty) that does not contain any 
commas, and that does not have any leading or trailing spaces.
Adjacent fragments are separated by a comma which might have spaces on either 
side.
Spaces at the beginning or end of the line or around a comma are ignored.
Put double square brackets around each fragment and vertical bars between the 
bracketed fragments.
Discard the comma/space separators and leading and trailing spaces.
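
The rules above can also be written down directly as a short Python sketch, 
which may help to check whether they match what you want. This is not a BBEdit 
feature, and the function name bracket_fragments is made up for the 
illustration. Because it joins the fragments with the bar instead of appending 
a bar to each one, it does not produce the trailing bar.

```python
def bracket_fragments(line: str) -> str:
    """Apply the fragment rules to one line of text (without its new-line)."""
    # Split on commas, then drop the spaces around each fragment.
    fragments = [piece.strip(" ") for piece in line.split(",")]
    # Bracket each fragment and put a bar between neighbours only.
    return "|".join(f"[[{fragment}]]" for fragment in fragments)


print(bracket_fragments("House, Big Apple, Today, Movie"))
```

Note that an empty fragment (two commas in a row) comes out as [[]], which is 
what the rules say but may not be what you want.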

If that is what you wanted, this pattern will do the job:

Find:
(?x)
(?# 1: leading space)  [ ]*
(?# 2: fragment)       ([^\n,]*?)
(?# 3: trailing space) [ ]*
(?# 4: separator)      (?:,|(\n))
Replace: [[\1]]|\2

It works like this:
The leading space component [ ]* matches spaces before the fragment, but doesn’t 
include them in the fragment. (This matches spaces at the start of a line, and 
also spaces following a comma, since the comma itself was consumed by the 
previous match.)
The capture group ([^\n,]*?) defines the actual fragments. It matches a string 
of characters which are not commas or end-of-lines. Note the use of the 
non-greedy repetition operator *?. This means that the fragment is the shortest 
string which matches this sub-pattern, while still allowing the remainder of 
the pattern to match. Trailing spaces will be matched by component 3 below but 
won’t be included in the fragment.
The trailing space component [ ]* matches spaces after the fragment, but 
doesn’t include them in the fragment.
The separator component (?:,|(\n)) matches either a comma separator or the end 
of the line.
Note the use of (?:…), which means that these are “grouping” parentheses, not 
“capturing” parentheses. The separator is part of the pattern, but it isn’t 
part of the fragment.
The new-line character is enclosed in capturing parentheses. This means that 
the pattern match for the last fragment in a line captures the new-line as 
capture group 2 (which is otherwise empty), and the \2 at the end of the 
replacement causes the newline to be included following the fragment in the 
replacement string. 
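
To see the components at work, here is a small Python sketch that runs the same 
pattern against the example line. It assumes Python's re module behaves like 
BBEdit's Grep for this pattern; the names PATTERN, text and result are made up 
for the illustration, and the comments use Python's # syntax in place of (?# …).

```python
import re

# Same four components as the Find pattern, in Python's verbose syntax.
PATTERN = re.compile(r"""
    [ ]*            # 1: leading spaces, not captured
    ([^\n,]*?)      # 2: the fragment, capture group 1, non-greedy
    [ ]*            # 3: trailing spaces, not captured
    (?:,|(\n))      # 4: separator; a new-line becomes capture group 2
""", re.VERBOSE)

text = "House, Big Apple, Today, Movie\n"

# Show what each match consumed and what it captured.
for m in PATTERN.finditer(text):
    print(repr(m.group(0)), repr(m.group(1)), repr(m.group(2)))

# Group 2 is unset for comma-terminated fragments and is replaced by nothing.
result = PATTERN.sub(r"[[\1]]|\2", text)
print(result, end="")
```

There are four matches, one per fragment, and only the last one captures a 
new-line in group 2. The result is the line shown at the top of this message, 
including the trailing vertical bar.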

A find-and-replace-all should match the entire text of each input line, one 
fragment at a time. The replacement contains each captured fragment, enclosed 
in doubled square brackets and followed by a vertical bar, with the captured 
new-line at the end of the replacement for the last fragment. 



-- 
This is the BBEdit Talk public discussion group. If you have a feature request 
or believe that the application isn't working correctly, please email 
"[email protected]" rather than posting here. Follow @bbedit on Mastodon: 
<https://mastodon.social/@bbedit>
