On 06/03/2013 15:58, Nick wrote:
Thanks, that did exactly what I was looking for. But, I realized I
also need to do this for anchor tags with relative links, such as:
<a href="/xxx//zzz.shtml">ordinateur de bureau</a>
A text filter something like this should do everything you want
On Wed, Mar 6, 2013 at 10:58 AM, Nick grizfa...@gmail.com wrote:
Thanks, that did exactly what I was looking for. But, I realized I also need
to do this for anchor tags with relative links, such as:
<a href="/xxx//zzz.shtml">ordinateur de bureau</a>
I'm going to re-post my previous
message:
*From: *Dmitry Markman dmar...@me.com
*Subject: *Re: Strip away all HTML, leaving just the URLs
*Date: *March 1, 2013 10:21:06 PM EST
*To: *bbe...@googlegroups.com
On Sat, Mar 2, 2013 at 12:38 PM, Nick griz...@gmail.com wrote:
I need to extract
this will pick up anything like:
<img height="239" alt="trix_5" width="357" src="Thumbnails/4.jpg">
giving you "239", for example.
Where do I check "Copy to new document"?
On 02/03/2013 11:00, LuKreme wrote:
In our previous episode (Friday, 01-Mar-2013), Nick said:
<ul>
<li><a
This might be better done on the command line.
$ grep -Po '(?<=href=")[^"]+' [file name]
This will give you the content of every href attribute in the file, and
nothing else. Just a list of URLs.
If there are any URLs you want to exclude, such as mailto:, javascript: or
anchors (e.g.
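The suggestion is cut off in this snippet, but the natural way to exclude those is a second grep with -v over the extracted list; a sketch, assuming GNU grep is available:

```shell
# Extract every double-quoted href value, then drop mailto:, javascript:
# and fragment-only anchors from the list. Requires GNU grep (-P).
clean_hrefs() {
  grep -Po '(?<=href=")[^"]+' | grep -Ev '^(mailto:|javascript:|#)'
}
```

Run it as `clean_hrefs < page.html`; extend the alternation with any other schemes you want filtered out.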
In our previous episode (Saturday, 02-Mar-2013), Ron Catterall said:
On 02/03/2013 11:00, LuKreme wrote:
In our previous episode (Friday, 01-Mar-2013), Nick said:
<ul>
<li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
<li><a href="http://www.facebook.com"
On Mar 02, 2013, at 10:13, Dave dave.live...@gmail.com wrote:
This might be better done on the command line.
$ grep -Po '(?<=href=")[^"]+' [file name]
That's not going to work on a stock Mountain Lion install (where the -P option isn't supported by the bundled BSD grep).
In our previous episode (Friday, 01-Mar-2013), Nick said:
<ul>
<li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
<li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
<li><a href="http://www.twitter.com" class="twitter">Twitter</a></li>
Hi,
I need to extract the URLs from a large number of HTML files. Basically,
take something like this:
<ul>
<li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
<li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
<li><a href="http://www.twitter.com" class="twitter">Twitter</a></li>
</ul>
And
On Sat, Mar 2, 2013 at 12:38 PM, Nick grizfa...@gmail.com wrote:
I need to extract the URLs from a large number of HTML files. Basically,
take something like this:
<ul>
<li><a href="http://www.youtube.com" class="youtube">YouTube</a></li>
<li><a href="http://www.facebook.com" class="facebook">Facebook</a></li>
Begin forwarded message:
From: Dmitry Markman dmark...@me.com
Subject: Re: Strip away all HTML, leaving just the URLs
Date: March 1, 2013 10:21:06 PM EST
To: bbedit@googlegroups.com
On Sat, Mar 2, 2013 at 12:38 PM, Nick grizfa...@gmail.com wrote:
I need to extract the URLs from a large
At 15:56 +1300 on 03/02/2013, Miraz Jordan wrote about Re: Strip away
all HTML, leaving just the URLs:
The inefficient way I'd do it is:
1] replace all " with \r (puts each URL on its own line)
2] Text menu - Process lines containing http:// (put in a new document)
Done.
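The same two steps have a one-line shell equivalent, assuming the hrefs are double-quoted: splitting the text on quote characters puts each URL on its own line, and a grep plays the part of Process Lines Containing. A sketch:

```shell
# Step 1: every double quote becomes a newline, so each quoted URL lands
# on a line of its own. Step 2: keep only the lines that start http://.
two_step() {
  tr '"' '\n' | grep '^http://'
}
```

Usage: `two_step < page.html > urls.txt` (add https:// to the grep pattern if the pages use it).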
In lieu of step 2
Here's a text filter which will do just that, using `lynx`, which
unfortunately is not installed in OS X by default; you can get it
either from Homebrew (my preference) or, if you want a precompiled
binary in a nice installer, from Rudix:
http://rudix.googlecode.com/files/lynx-2.8.7-3.pkg
TjL