I’m not sure this can be made reliable. Regular expressions can’t balance tags, so:

```html
<div>
 Leave me out!
 <div class="main-content">
  Include me!
   <div>And me!</div>
  And also me!
 </div>
 But not me.
</div>
```

Will result in:

```html
<div class="main-content">
  Include me!
   <div>And me!</div>
```

If you make the pattern greedy, you’ll get:

```html
<div class="main-content">
  Include me!
   <div>And me!</div>
  And also me!
 </div>
 But not me.
</div>
```

The non-greedy pattern is just fine if you don’t have any DIVs inside your main-content DIV, but how likely is that?
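For anyone who wants to see both results without opening BBEdit, here is a rough sketch in Python. It assumes Python's `re` module is a close enough stand-in for BBEdit's grep engine for these particular constructs. The inline `(?s)` switches from the quoted Find pattern are replaced by `re.DOTALL`, because recent Python versions do not accept inline flags in the middle of a pattern. The names `SAMPLE`, `LAZY`, and `GREEDY` are made up for the example.

```python
import re

# The nested sample from above; the closing triple quote leaves a trailing newline.
SAMPLE = """<div>
 Leave me out!
 <div class="main-content">
  Include me!
   <div>And me!</div>
  And also me!
 </div>
 But not me.
</div>
"""

# Same shape as the quoted Find pattern; re.DOTALL stands in for the inline (?s).
LAZY = re.compile(r'\A.+?(<div class="main-content">.+?</div>).+', re.DOTALL)
# Identical except that the match inside the div is greedy.
GREEDY = re.compile(r'\A.+?(<div class="main-content">.+</div>).+', re.DOTALL)

lazy_result = LAZY.sub(r"\1", SAMPLE)      # stops at the first </div>
greedy_result = GREEDY.sub(r"\1", SAMPLE)  # runs on to the last </div>

print(lazy_result)
print("-----")
print(greedy_result)
```

Neither variant stops at the `</div>` that actually closes the main-content DIV; the lazy one stops too early and the greedy one too late.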

You’re better off using a tool that’s designed to manipulate HTML. Some options are listed [here](https://superuser.com/questions/528709/command-line-css-selector-tool/528728).
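As one illustration of what "designed to manipulate HTML" buys you, here is a minimal sketch using only `html.parser` from the Python standard library. It is not one of the tools from the linked answer, just an example of the approach: the parser reports each start and end tag, so the code can count DIV nesting and stop at the matching close. `DivExtractor` and `extract_div` are hypothetical names, and the sketch assumes reasonably well-formed markup; comments and doctype declarations inside the DIV are dropped.

```python
from html.parser import HTMLParser


class DivExtractor(HTMLParser):
    """Collect the source of the first <div> carrying a given class, nested divs included."""

    def __init__(self, class_name):
        super().__init__(convert_charrefs=False)  # keep &amp; etc. as written
        self.class_name = class_name
        self.depth = 0      # 0 = outside the target div, otherwise div nesting level
        self.done = False   # True once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == "div":
                self.depth += 1
            self.parts.append(self.get_starttag_text())
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.class_name in classes:
                self.depth = 1
                self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through without touching depth.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.depth:
            return
        self.parts.append(f"</{tag}>")
        if tag == "div":
            self.depth -= 1
            self.done = self.depth == 0

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth:
            self.parts.append(f"&#{name};")


def extract_div(html, class_name="main-content"):
    """Return the target div as text, or an empty string if it is not found."""
    parser = DivExtractor(class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


SAMPLE = """<div>
 Leave me out!
 <div class="main-content">
  Include me!
   <div>And me!</div>
  And also me!
 </div>
 But not me.
</div>
"""

result = extract_div(SAMPLE)
print(result)
```

On the nested sample this keeps "And also me!" and the correct closing tag, and leaves out "But not me.", which is the case neither regex variant handles.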

It’d be lovely if BBEdit could allow find/replace based on CSS selectors or XPath expressions in addition to text and regexps. But presumably that would be a large undertaking.

Just my 2¢.
-sam

On 26 Apr 2019, at 16:11, Patrick Woolsey wrote:

On 4/26/19 at 3:07 PM, [email protected] (Phil Emery) wrote:

Ideally it would be great to delete all other content from each existing file.

OK, thanks and in that case, you should be able to obtain the desired outcome by performing a multi-file search & replace with "Grep" enabled and patterns like these:

Find:      \A(?s).+?(<div class="main-content">(?s).+?</div>)(?s).+

Replace:   \1

and in short, here's how the patterns work:

The Find pattern begins by matching at the start of the document with \A, then _non-greedily_ matches one or more instances of any character with .+? (the trailing ? makes the match non-greedy), _including_ line breaks (achieved by prepending (?s) to the .). Next comes a single _sub-pattern_, enclosed in parentheses ( ), which consists of the opening div, one or more characters in another non-greedy match across lines, and then a closing div. Finally, (?s).+ matches any characters remaining in the document, including line breaks.

[NB: You'll need to adjust the exact form of the desired <div> to suit your content, i.e. depending on whether these sections are identified by 'class', 'name', or 'id'.]

The Replace pattern then reinserts only the contents of the matched subpattern (consisting of the desired div, its contents, and the closing div), thus effectively deleting everything else.
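To try the Find/Replace pair on a throwaway string first, a rough Python equivalent might look like the sketch below. It assumes Python's `re` module behaves like BBEdit's grep for these constructs, passes `re.DOTALL` in place of the inline `(?s)` switches (recent Python versions do not accept inline flags mid-pattern), and uses a made-up sample page with no nested divs inside the target. `FIND`, `keep_main_content`, and `page` are hypothetical names.

```python
import re

# The Find pattern, with the inline (?s) switches replaced by re.DOTALL.
FIND = re.compile(r'\A.+?(<div class="main-content">.+?</div>).+', re.DOTALL)


def keep_main_content(text):
    """Apply the Replace pattern \\1; text is returned unchanged if Find does not match."""
    return FIND.sub(r"\1", text, count=1)


# A made-up page whose target div contains no other divs.
page = (
    "<html><body>\n"
    '<div class="nav">menu</div>\n'
    '<div class="main-content">\n'
    "<p>Body text</p>\n"
    "</div>\n"
    '<div class="footer">footer</div>\n'
    "</body></html>\n"
)

result = keep_main_content(page)
print(result)
```

Because the pattern needs at least one character before and after the div, a file that starts or ends exactly at the div would not match and would be left as it was.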

As always, I recommend you try this procedure out on a few sample files or a cloned copy before applying it to your actual data, just to make sure it's doing what you expect/want. :-)


Regards,

  Patrick Woolsey
==
Bare Bones Software, Inc.             <https://www.barebones.com/>

--
This is the BBEdit Talk public discussion group. If you have a feature request or need technical support, please email
"[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <https://www.twitter.com/bbedit>
--- You received this message because you are subscribed to the Google Groups "BBEdit Talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/bbedit.
