Begin forwarded message:

> From: Dmitry Markman <[email protected]>
> Subject: Re: Strip away all HTML, leaving just the URLs
> Date: March 1, 2013 10:21:06 PM EST
> To: [email protected]
> 
> On Sat, Mar 2, 2013 at 12:38 PM, Nick <[email protected]> wrote:
>> I need to extract the URLs from a large number of HTML files. Basically,
>> take something like this:
>> 
>> <ul>
>> 
>> <li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
>> <li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
>> <li><a href="http://www.twitter.com" class="twitter">Twitter</a></li>
>> </ul>
>> 
>> And output this:
>> http://www.youtube.com
>> http://www.facebook.com
>> http://www.twitter.com
> 
> 1. New -> Text Factory
> 2. Choose "Process Lines Containing"
> 3. Click Options
> 4. Check "Use Grep"
> 5. Find lines containing \"(http:.*?)\"
> 6. Make sure the "Delete matching lines" checkbox is unchecked
> 
> 7. Click on the +
> 8. Pick "Replace All"
> 9. Click Options
> 10. Check "Use Grep"
> 11. Search for ^.*?\"(http:.*?)\".*$
> 12. Replace with \1
> 
> Run
> 
> 
> 
> Dmitry Markman
> 
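
For anyone who would rather script this than build a Text Factory, here is a minimal Python sketch of the same idea, not a drop-in equivalent of the steps above. It reuses the pattern from step 5 and keeps the first quoted URL on each line. The names `URL_IN_QUOTES` and `extract_urls` are my own, and I am assuming lines without a URL should simply be dropped, as in Nick's desired output. Like the original pattern, it only matches http: URLs, not https:.

```python
import re

# Same pattern as step 5: a double-quoted string that starts with "http:".
# The group captures the URL without the surrounding quotes.
URL_IN_QUOTES = re.compile(r'"(http:.*?)"')


def extract_urls(html):
    """Return the first quoted http: URL from each line that has one."""
    urls = []
    for line in html.splitlines():
        match = URL_IN_QUOTES.search(line)
        if match:
            # Equivalent to replacing the whole line with \1.
            urls.append(match.group(1))
    return urls


sample = """<ul>

<li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
<li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
<li><a href="http://www.twitter.com" class="twitter">Twitter</a></li>
</ul>"""

print("\n".join(extract_urls(sample)))
```

To run it over many files, read each file's text and pass it to `extract_urls`. A line holding more than one link would need `findall` instead of `search`.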

Dmitry Markman

-- 
You received this message because you are subscribed to the 
"BBEdit Talk" discussion group on Google Groups.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
<http://groups.google.com/group/bbedit?hl=en>
If you have a feature request or would like to report a problem, 
please email "[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>


