I’m not sure this can be made reliable. Regular expressions
can’t balance nested tags, so this input:
```html
<div>
Leave me out!
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
```
With the non-greedy pattern as given, the result is:
```html
<div class="main-content">
Include me!
<div>And me!</div>
```
If you make the pattern greedy, you’ll get:
```html
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
```
The non-greedy version is just fine if you don’t have any DIVs inside
your main-content DIV, but how likely is that?
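If you want to reproduce both results outside BBEdit, here's a rough sketch in Python. It's an approximation: Python's `re` module is not BBEdit's grep engine, but lazy and greedy quantifiers behave the same way for this pattern. `re.DOTALL` stands in for `(?s)`, and the names `SAMPLE`, `OPEN`, `lazy` and `greedy` are mine.

```python
import re

SAMPLE = """<div>
Leave me out!
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
"""

OPEN = r'<div class="main-content">'

# Lazy: stops at the first </div> after the opening tag (the nested one).
lazy = re.search(OPEN + r".+?</div>", SAMPLE, re.DOTALL).group(0)

# Greedy: runs on to the last </div> in the document (the outer wrapper's).
greedy = re.search(OPEN + r".+</div>", SAMPLE, re.DOTALL).group(0)

print(lazy)
print("-----")
print(greedy)
```

Neither match ends at the `</div>` that actually closes main-content, which is the whole problem.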
Better off using a tool that’s designed to manipulate HTML. Some
options
[here](https://superuser.com/questions/528709/command-line-css-selector-tool/528728).
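To give an idea of what "designed to manipulate HTML" buys you, here's a minimal sketch using Python's standard-library `html.parser`. It isn't one of the tools from that link, just an illustration of the idea: count `<div>` nesting depth so that the right `</div>` ends the capture. The class and function names (`MainContentExtractor`, `extract_main_content`) are made up for this example, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

SAMPLE = """<div>
Leave me out!
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
"""


class MainContentExtractor(HTMLParser):
    """Collect the source text of the first <div class="main-content">."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.depth = 0     # open <div>s inside the target; 0 means outside it
        self.done = False  # True once the target's own </div> has been seen
        self.parts = []

    def _emit(self, text):
        if self.depth > 0:
            self.parts.append(text)

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            classes = (dict(attrs).get("class") or "").split()
            if tag != "div" or "main-content" not in classes:
                return
            self.depth = 1
        elif tag == "div":
            self.depth += 1
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        self.parts.append(f"</{tag}>")
        if tag == "div":
            self.depth -= 1
            self.done = self.depth == 0

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")


def extract_main_content(html):
    """Return the first main-content div as text, or "" if there is none."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


print(extract_main_content(SAMPLE))
```

On the sample above this keeps the nested DIV and "And also me!" and stops before "But not me.". A proper CSS-selector or XPath tool would do the same job with far less code.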
It’d be lovely if BBEdit could allow find/replace based on CSS
selectors or XPath expressions in addition to text and regexps. But
presumably that would be a large undertaking.
Just my 2¢.
-sam
On 26 Apr 2019, at 16:11, Patrick Woolsey wrote:
On 4/26/19 at 3:07 PM, [email protected] (Phil Emery) wrote:
Ideally it would be great to delete all other content from each
existing file.
OK, thanks and in that case, you should be able to obtain the desired
outcome by performing a multi-file search & replace with "Grep"
enabled and patterns like these:
Find: \A(?s).+?(<div class="main-content">(?s).+?</div>)(?s).+
Replace: \1
and in short, here's how the patterns work:
The Find pattern begins by matching at the start of the document (\A),
then _non-greedily_ (?) matches one or more instances of any character
(.+), _including_ line breaks (achieved by prepending (?s) to the .).
Next comes a single _sub-pattern_, whose contents are enclosed in
parentheses ( ) and consist of the opening div, one or more characters
in another non-greedy match across lines, and then a closing div.
Finally, (?s).+ matches any characters remaining in the document,
including line breaks.
[NB: You'll need to adjust the exact form of the desired <div> to suit
your content, i.e. depending on whether these sections are identified by
'class', 'name', or 'id'.]
The Replace pattern then reinserts only the contents of the matched
subpattern (consisting of the desired div, its contents, and the
closing div), thus effectively deleting everything else.
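For experimenting with the same Find/Replace pair outside BBEdit, here is a rough Python equivalent. It is a sketch, not BBEdit's own behavior: the `DOC` sample is invented, and the inline (?s) groups are replaced by the `re.DOTALL` flag, because recent Python versions (3.11 and later) reject inline global flags that are not at the very start of a pattern.

```python
import re

# Invented sample: a flat page with no <div> nested inside main-content.
DOC = """<html><body>
<div id="header">Site header</div>
<div class="main-content">
<p>Keep this.</p>
</div>
<div id="footer">Site footer</div>
</body></html>
"""

FIND = re.compile(
    r'\A.+?(<div class="main-content">.+?</div>).+',
    re.DOTALL,  # lets . match line breaks, like (?s) in the BBEdit pattern
)

# Replace the whole document with capture group 1 (the Replace pattern \1).
result = FIND.sub(r"\1", DOC)
print(result)
```

Note that the leading .+? and trailing .+ each need at least one character, so a file that begins directly with the target div, or ends right at its closing tag, would not match and would be left unchanged.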
As always, I recommend you try this procedure out on a few sample
files or a cloned copy before applying it to your actual data, just to
make sure it's doing what you expect/want. :-)
Regards,
Patrick Woolsey
==
Bare Bones Software, Inc. <https://www.barebones.com/>
--
This is the BBEdit Talk public discussion group. If you have a feature
request or need technical support, please email
"[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <https://www.twitter.com/bbedit>
--- You received this message because you are subscribed to the Google
Groups "BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/bbedit.