> > The file "temp.html" is definitely different than the first run, but
> > still not anything close to www.python.org. Any other suggestions?
>
> If you mean that the page looks different in a browser, for one thing
> you have to download the css files too. Here's the relevant extract
> from the main page:
>
> <link media="screen" href="styles/screen-switcher-default.css"
> type="text/css" id="screen-switcher-stylesheet" rel="stylesheet" />
> <link media="scReen" href="styles/netscape4.css" type="text/css"
> rel="stylesheet" />
> <link media="print" href="styles/print.css" type="text/css"
> rel="stylesheet" />
> <link media="screen" href="styles/largestyles.css" type="text/css"
> rel="alternate stylesheet" title="large text" />
> <link media="screen" href="styles/defaultfonts.css" type="text/css"
> rel="alternate stylesheet" title="default fonts" />
>
> You may either hardcode the urls of the css files, or parse the page,
> extract the css links and normalize them to absolute urls. The first is
> simpler but the second is more robust, in case a new css is added or an
> existing one is renamed or removed.
>
> George
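
The second approach George describes (parse the page, extract the css links, normalize them to absolute urls) could look roughly like the sketch below. It is written for Python 3's standard library (html.parser and urllib.parse) rather than the Python 2 urllib used in this thread, and the names StylesheetLinkParser and find_stylesheets are made up for illustration. It treats any <link> whose rel contains "stylesheet" as a css link, which also picks up the "alternate stylesheet" entries.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class StylesheetLinkParser(HTMLParser):
    """Collect absolute URLs of <link rel="... stylesheet ..."> tags.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.stylesheets = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also arrive here.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "stylesheet" in rel_tokens and href:
            # urljoin resolves a relative href against the page URL.
            self.stylesheets.append(urljoin(self.base_url, href))


def find_stylesheets(html, base_url):
    """Return the stylesheet URLs found in html, made absolute."""
    parser = StylesheetLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.stylesheets


# Two of the <link> tags quoted above, used as sample input.
sample = (
    '<link media="print" href="styles/print.css" type="text/css" '
    'rel="stylesheet" />\n'
    '<link media="screen" href="styles/largestyles.css" type="text/css" '
    'rel="alternate stylesheet" title="large text" />\n'
)
css_urls = find_stylesheets(sample, "http://www.python.org/")
```

Each URL in css_urls could then be fetched and saved next to temp.html, the same way the page itself is fetched.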

Thanks for the information on CSS. I'll look into that later, but now
my question is about the first two lines of the HTML. Here's my latest
python code:

>>> import urllib
>>> web_page = urllib.urlopen("http://www.python.org")
>>> fileTemp = open("temp.html", "w")
>>> web_page_contents = web_page.read()
>>> fileTemp.write(web_page_contents)
>>> fileTemp.close()

Here are the first two lines of temp.html:

1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
2 <html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">

Here are the first two lines of www.python.org as saved from Firefox:

1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
2 <html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml" lang="en"><head>

Line one is identical in both files. Line two is different. Why would
line two differ? Hmmmm...

Thanks,
Pete
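
One way to check whether the two versions of line two differ in anything beyond layout is to parse each <html> tag and compare its attributes as a dictionary, since attribute order carries no meaning in HTML or XML. The sketch below does that with Python 3's html.parser; FirstTagAttributes and first_tag_attributes are hypothetical names used only for this illustration. It says nothing certain about why Firefox wrote the tag differently; a plausible assumption is that the browser saves its parsed copy of the page rather than the exact bytes it received.

```python
from html.parser import HTMLParser


class FirstTagAttributes(HTMLParser):
    """Remember the attributes of the first start tag seen.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.attributes = None

    def handle_starttag(self, tag, attrs):
        # Only the first tag matters here; later tags such as <head> are ignored.
        if self.attributes is None:
            self.attributes = dict(attrs)


def first_tag_attributes(markup):
    """Return the first tag's attributes as a dict (order-insensitive)."""
    parser = FirstTagAttributes()
    parser.feed(markup)
    parser.close()
    return parser.attributes


# Line two of temp.html and line two of the Firefox copy, as quoted above.
fetched = '<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">'
saved = '<html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml" lang="en"><head>'

same_attributes = first_tag_attributes(fetched) == first_tag_attributes(saved)
```

If same_attributes comes out true, the two lines carry the same information: the attributes are merely listed in a different order, and the Firefox copy puts <head> on the same line.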