Re: Advanced Grep querys

Patrick Woolsey Fri, 26 Apr 2019 13:38:22 -0700

That is true and my apologies for not including a suitable caveat in my prior post, though sometimes it can still be helpful to start with a simple solution and work out from there. :-)


Regards,


  -- Patrick


On 4/26/19 at 4:21 PM, [email protected] (Sam Hathaway) wrote:

I’m not sure this can be made to be reliable. Regular expressions can’t balance 
tags, so:

```html
<div>
Leave me out!
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
```

Will result in:

```html
<div class="main-content">
Include me!
<div>And me!</div>
```

If you make the pattern greedy, you’ll get:

```html
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
```

Just fine if you don’t have any DIVs inside your main-content DIV, but how 
likely is that?
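
Both outcomes can be reproduced outside BBEdit. The sketch below uses Python's `re` module as a stand-in for BBEdit's Grep engine (an assumption; the two engines differ in details) and swaps the inline `(?s)` markers for `re.DOTALL`, since recent Python versions reject `(?s)` in the middle of a pattern. The names `DOC`, `LAZY`, and `GREEDY` are made up for illustration.

```python
import re

# The sample document from above, with a trailing newline so that the
# final ".+" always has at least one character left to match.
DOC = """<div>
Leave me out!
<div class="main-content">
Include me!
<div>And me!</div>
And also me!
</div>
But not me.
</div>
"""

# Python adaptation of the Find pattern: re.DOTALL lets "." match line breaks.
LAZY = re.compile(r'\A.+?(<div class="main-content">.+?</div>).+', re.DOTALL)

# Same pattern with a greedy match inside the captured div.
GREEDY = re.compile(r'\A.+?(<div class="main-content">.+</div>).+', re.DOTALL)

lazy_result = LAZY.sub(r"\1", DOC)      # stops at the FIRST closing </div>
greedy_result = GREEDY.sub(r"\1", DOC)  # runs on to the LAST closing </div>

print(lazy_result)
print("-----")
print(greedy_result)
```

Neither variant stops at the closing tag that actually belongs to the main-content DIV: the lazy one cuts off before "And also me!", and the greedy one drags in "But not me." as well.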
Better off using a tool that’s designed to manipulate HTML. Some options [here](https://superuser.com/questions/528709/command-line-css-selector-tool/528728).
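
To illustrate what "balancing tags" means in practice, here is a minimal sketch that uses Python's standard-library `html.parser` rather than any of the tools linked above. It counts open DIVs so that it stops at the closing tag that really belongs to the main-content DIV. The class and function names are hypothetical, and the sketch is deliberately simple: it drops comments and assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser


class MainContentExtractor(HTMLParser):
    """Collects the first <div class="main-content">, nested divs included."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.depth = 0      # open <div>s inside the target; 0 means "outside"
        self.done = False   # set once the target div has been closed
        self.parts = []

    def _inside(self):
        return self.depth > 0 and not self.done

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            classes = (dict(attrs).get("class") or "").split()
            if tag == "div" and "main-content" in classes:
                self.depth = 1
                self.parts.append(self.get_starttag_text())
            return
        if tag == "div":
            self.depth += 1
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self._inside():
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self._inside():
            return
        self.parts.append(f"</{tag}>")
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self._inside():
            self.parts.append(data)

    def handle_entityref(self, name):
        if self._inside():
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self._inside():
            self.parts.append(f"&#{name};")


def extract_main_content(html):
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


SAMPLE = (
    '<div>\nLeave me out!\n<div class="main-content">\nInclude me!\n'
    "<div>And me!</div>\nAnd also me!\n</div>\nBut not me.\n</div>\n"
)
print(extract_main_content(SAMPLE))
```

On the sample document this keeps "And also me!" and leaves out "But not me.", which is the result neither regex variant can produce.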
It’d be lovely if BBEdit could allow find/replace based on CSS selectors or XPath expressions in addition to text and regexps. But presumably that would be a large undertaking.
Just my 2¢.
-sam

On 26 Apr 2019, at 16:11, Patrick Woolsey wrote:

On 4/26/19 at 3:07 PM, [email protected] (Phil Emery) wrote:

Ideally it would be great to delete all other content from each existing file.

OK, thanks and in that case, you should be able to obtain the desired outcome by performing a multi-file search & replace with "Grep" enabled and patterns like these:

Find:      \A(?s).+?(<div class="main-content">(?s).+?</div>)(?s).+

Replace:   \1

and in short, here's how the patterns work:

The Find pattern begins by matching at the start of the document \A and then _non-greedily_ matches ? one or more instances of any character .+ _including_ line breaks (achieved by pre-pending (?s) to the .) and followed by a single _sub-pattern_, whose contents are enclosed in parentheses ( ) and consist of the opening div followed by one or more characters in another non-greedy match across lines and then a closing div, and finally matching any characters remaining in the document, including line breaks (?s).+

[NB: You'll need to adjust the exact form of the desired <div> to suit your content, i.e. depending whether these sections are identified by 'class', 'name', or 'id'.]

The Replace pattern then reinserts only the contents of the matched subpattern (consisting of the desired div, its contents, and the closing div), thus effectively deleting everything else.
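
For anyone who wants to try the Find/Replace pair outside BBEdit, here is a rough Python equivalent. It is only a sketch: Python's `re` module is not BBEdit's Grep engine, and it needs the scoped form `(?s:...)` around each dot-match because recent Python versions do not accept `(?s)` in the middle of a pattern. The function name `keep_main_content` and the sample page are invented for illustration, and the sample deliberately has no DIVs nested inside the main-content DIV.

```python
import re

# Find pattern, with each "(?s)." run rewritten as a scoped (?s:...) group.
FIND = re.compile(
    r'\A(?s:.+?)'                             # everything before the div (lazy)
    r'(<div class="main-content">(?s:.+?)</div>)'  # group 1: the div itself
    r'(?s:.+)'                                # everything after it
)
REPLACE = r"\1"  # reinsert only the captured div


def keep_main_content(text):
    """Return only the main-content div; text without a match is unchanged."""
    return FIND.sub(REPLACE, text)


page = """<html>
<body>
<p>Header</p>
<div class="main-content">
<p>Keep this.</p>
</div>
<p>Footer</p>
</body>
</html>
"""
print(keep_main_content(page))
```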
As always, I recommend you try this procedure out on a few sample files or a cloned copy before applying it to your actual data, just to make sure it's doing what you expect/want. :-)

Regards,

Patrick Woolsey
==
Bare Bones Software, Inc.             <https://www.barebones.com/>

--
This is the BBEdit Talk public discussion group. If you have a feature request or need technical support, please email
"[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <https://www.twitter.com/bbedit>
---
You received this message because you are subscribed to the Google Groups "BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/bbedit.

