Resending, this time to the list.

http://aspn.activestate.com/ASPN/docs/ActivePython/2.5/diveintopython/html/html_processing/extracting_data.html
has some sample recipes that should give you a good starting point.
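Those recipes date from the Python 2.5 / sgmllib era, but the same idea works with the modern standard-library `html.parser` module. Here is a minimal sketch (class and variable names are my own) that collects each link's visible text together with its href, which covers both of Sean's cases below:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects (link text, href) pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []            # finished (text, href) pairs
        self._current_href = None  # href of the <a> we are inside, if any
        self._current_text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._current_text).strip()
            self.links.append((text, self._current_href))
            self._current_href = None

parser = LinkExtractor()
parser.feed('<p><a href="http://slug.org.au/">SLUG home</a> and '
            '<a href="/faq/">the FAQ</a></p>')
print(parser.links)
# [('SLUG home', 'http://slug.org.au/'), ('the FAQ', '/faq/')]
```

For the drill-down part, you would fetch each collected href (resolving relative paths with `urllib.parse.urljoin`) and feed the result to a fresh parser, stopping at your depth limit.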

On 22/07/07, Sean Murphy <[EMAIL PROTECTED]> wrote:
All.

I wish to extract specific links from a web page.  Part of the requirement
is to be able to drill down three to four levels to extract the
information.

The first page will have 26 to 30 links I want to extract.  The number of
links at the levels beneath the first page is much higher.

I only want the text, not the underlying HTML code, but if I get the URL
path I can deal with it.

Wget grabs too much information for my use.  I only know the higher levels
of Perl and I am starting to learn Ruby, so my coding under Linux is not
very advanced at all.


Sean Murphy
Skype: smurf20005

Life is a challenge, treat it that way.

--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html




--
There is nothing more worthy of contempt than a man who quotes himself
- Zhasper, 2004
