Re: how to extract page-URL using BeautifulSoup

MRAB Thu, 31 Oct 2013 10:39:17 -0700

On 31/10/2013 15:59, bhaktanish...@gmail.com wrote:

I want to extract the page URL. For example,
if I have this code:


import urllib2
from bs4 import BeautifulSoup
link = "http://www.google.com"
page = urllib2.urlopen(link).read()
soup = BeautifulSoup(page)

then I can extract the title of the page with:

title = soup.title

But I want to know how to extract the page URL from "soup", which would be
"http://www.google.com".

Have a look at what you're passing to BeautifulSoup (save it to a file
and look at it in an editor). It's HTML. Does it contain anything that
says where it came from? No. So BeautifulSoup can't know either.

All BeautifulSoup does is parse the HTML that it's given.
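
So the URL has to come from your own code, which already has it in `link`.
If redirects matter, the response object returned by urlopen can report the
address it ended up at through geturl(). A minimal sketch, assuming Python 3's
urllib.request in place of urllib2; fetch_with_url and its opener parameter
are hypothetical names introduced here for illustration:

```python
from urllib.request import urlopen


def fetch_with_url(link, opener=urlopen):
    """Fetch link and return (final_url, html).

    The URL is taken from the HTTP response, not from the HTML,
    because the markup itself does not record where it came from.
    `opener` is a hypothetical hook so the function can be tried
    without network access.
    """
    with opener(link) as response:
        final_url = response.geturl()  # address after any redirects
        html = response.read()         # raw bytes of the page
    return final_url, html
```

You would then keep the two together yourself, for example
`url, page = fetch_with_url(link)` followed by
`soup = BeautifulSoup(page, "html.parser")`, and pass `url` around next to
`soup` wherever you need it.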
--
https://mail.python.org/mailman/listinfo/python-list
