Begin forwarded message:
> From: Dmitry Markman <[email protected]>
> Subject: Re: Strip away all HTML, leaving just the URLs
> Date: March 1, 2013 10:21:06 PM EST
> To: [email protected]
>
> On Sat, Mar 2, 2013 at 12:38 PM, Nick <[email protected]> wrote:
>> I need to extract the URLs from a large number of HTML files. Basically,
>> take something like this:
>>
>> <ul>
>>
>> <li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
>> <li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
>> <li><a href="http://www.twitter.com" class="twitter">Twitter</a></li>
>> </ul>
>>
>> And output this:
>> http://www.youtube.com
>> http://www.facebook.com
>> http://www.twitter.com
>
> 1. New -> Text Factory
> 2. Choose "Process Lines Containing"
> 3. Click Options
> 4. Set the check box "use grep"
> 5. Find lines containing \"(http:.*?)\"
> 6. Make sure that the check box "delete matching lines" is unchecked
>
> 7. Click on the +
> 8. Pick Replace All
> 9. Click Options
> 10. Set the check box "use grep"
> 11. Search for ^.*?\"(http:.*?)\".*$
> 12. Replace with \1
>
> Run
>
> Dmitry Markman

--
You received this message because you are subscribed to the
"BBEdit Talk" discussion group on Google Groups.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at
<http://groups.google.com/group/bbedit?hl=en>
If you have a feature request or would like to report a problem,
please email "[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>
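
For anyone who would rather script this outside BBEdit, here is a rough Python sketch of what the two Text Factory steps above amount to. It reuses the grep pattern from steps 5 and 11; the function name extract_urls and the sample string are made up for illustration and are not part of the original thread. Like the original pattern, it only picks up the first double-quoted "http:" string on each line, so https links or several links on one line would need a different pattern.

```python
import re

# Same idea as the grep pattern in steps 5 and 11: a double-quoted
# string that starts with "http:". The lazy .*? stops at the first
# closing quote, so attributes after href are not swallowed.
URL_IN_QUOTES = re.compile(r'"(http:.*?)"')


def extract_urls(html_text):
    """Return the first quoted http: URL from each line that has one.

    extract_urls is a hypothetical helper name, not a BBEdit feature.
    """
    urls = []
    for line in html_text.splitlines():
        match = URL_IN_QUOTES.search(line)
        if match:
            # Equivalent to replacing the whole line with \1.
            urls.append(match.group(1))
    return urls


# Sample input taken from the question in the thread.
sample = """<ul>

<li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
<li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
<li><a href="http://www.twitter.com" class="twitter">Twitter</a></li>
</ul>"""

for url in extract_urls(sample):
    print(url)
```

Lines without a quoted "http:" string (the <ul> tags and the blank line here) are simply skipped, which is the filtering the first Text Factory step is meant to provide.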
