On Wed, Mar 6, 2013 at 10:58 AM, Nick <[email protected]> wrote:
> Thanks, that did exactly what I was looking for. But, I realized I also need
> to do this for anchor tags with relative links, such as:
> <a href="/xxx/yyyyyyyy/zzzzzzz.shtml">ordinateur de bureau</a>
>

I'm going to re-post my previous suggestion because it will work for
all of these cases, boils down to one line, is extremely easy to
understand, and involves no complicated regex knowledge.

Create a BBEdit Text Filter which does this:

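        cat "$@" | lynx -dump -listonly -nonumbers -stdin -width 1024

That's it. (This assumes lynx is installed; if it's not, it's easy to get.)

Translation:

cat "$@" = take any files given as arguments, or the standard input
if there are none (the quotes keep file names with spaces intact)

| lynx = send (pipe) it to lynx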

-dump = dump whatever lynx processed to stdout
-listonly = create a list of links as output
-nonumbers = don't number links in the output (1…2…3…)
-stdin = tell lynx to read the HTML from stdin
-width 1024 = how wide you want the output to be

`lynx` is designed to parse HTML in all sorts of weird forms, and it
will handle cases where there are relative links with or without a
<BASE>.
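
If you want to see that for yourself, here is a small sketch that feeds
lynx a made-up page. The example.com base URL and the `sample_html`
function name are placeholders I invented for this test; the relative
link is the one quoted above. I've left the output out because it may
vary with your lynx version.

```shell
#!/bin/sh
# sample_html is a hypothetical helper; example.com is a placeholder base URL.
sample_html() {
    cat <<'EOF'
<html>
<head><base href="http://example.com/"></head>
<body>
<a href="/xxx/yyyyyyyy/zzzzzzz.shtml">ordinateur de bureau</a>
<a href="http://example.com/absolute.html">absolute link</a>
</body>
</html>
EOF
}

# Only run the pipeline when lynx is actually installed.
if command -v lynx >/dev/null 2>&1; then
    sample_html | lynx -dump -listonly -nonumbers -stdin -width 1024
fi
```

Delete the <base> line and run it again to compare how the relative
link is reported.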

I realize that there is always more than one way to do something, but
rather than try to reinvent HTML parsers, I'd rather use a program
which has been developed for years to do what you're trying to do.

`lynx` may not be part of the standard OS X install, but it's easy to
get. If you use Homebrew, it's just `brew install lynx`; otherwise, go
to Rudix and get the installer.

You can download the full version of this Text Filter which I use (and
which includes some additional error checking) here:
http://db.tt/fcAIRZuU
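
For anyone who would rather not download anything, here is a minimal
sketch of what that kind of error checking could look like. It is not
the filter from the link above; the function name `extract_links` and
the exit status 127 are my own choices for illustration, and it assumes
a POSIX shell.

```shell
#!/bin/sh
# Sketch only: NOT the filter from the link above.
# extract_links is a hypothetical name chosen for this example.

extract_links() {
    # Fail with a readable message if lynx is not on the PATH.
    if ! command -v lynx >/dev/null 2>&1; then
        echo "lynx not found; install it first (e.g. brew install lynx)" >&2
        return 127
    fi

    # The one-line pipeline; "$@" is quoted so file names with spaces survive.
    cat "$@" | lynx -dump -listonly -nonumbers -stdin -width 1024
}
```

To use it as a Text Filter, end the script with `extract_links "$@"`.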

TjL

-- 
You received this message because you are subscribed to the 
"BBEdit Talk" discussion group on Google Groups.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
<http://groups.google.com/group/bbedit?hl=en>
If you have a feature request or would like to report a problem, 
please email "[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>
