On May 7, 1:40 am, Jetus <[EMAIL PROTECTED]> wrote:
> On May 4, 7:22 am, [EMAIL PROTECTED] wrote:
> > On May 4, 12:33 am, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
> > > En Sun, 04 May 2008 01:33:45 -0300, Jetus <[EMAIL PROTECTED]> wrote:
> > > > Is there a good place to look to see where I can find some code that
> > > > will help me to save a webpage's links to the local drive, after I
> > > > have used urllib2 to retrieve the page?
> > > > Many times I have to view these pages when I do not have access to
> > > > the internet.
> > >
> > > Don't reinvent the wheel and use wget: http://en.wikipedia.org/wiki/Wget
> > >
> > > --
> > > Gabriel Genellina
> >
> > A lot of the functionality is already present.
> >
> > import urllib
> > urllib.urlretrieve('http://python.org/', 'main.htm')
> > from htmllib import HTMLParser
> > from formatter import NullFormatter
> > parser = HTMLParser(NullFormatter())
> > parser.feed(open('main.htm').read())
> > import urlparse
> > for a in parser.anchorlist:
> >     print urlparse.urljoin('http://python.org/', a)
> >
> > Output snipped:
> >
> > http://python.org/psf/
> > http://python.org/dev/
> > http://python.org/links/
> > ...
>
> How can I modify or add to the above code, so that the file references
> are saved to specified local directories, AND the saved webpage makes
> reference to the new saved files in the respective directories?
> Thanks for your help in advance.
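As a side note, the quoted snippet is Python 2 only: htmllib, formatter, and the parser's anchorlist attribute were all removed in Python 3. A rough modern equivalent, sketched with html.parser from the standard library (the sample HTML string is just an illustration), would be:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class AnchorCollector(HTMLParser):
    # Rebuilds the old htmllib anchorlist behaviour: collect every
    # href value found on an <a> tag while the page is parsed.
    def __init__(self):
        super().__init__()
        self.anchorlist = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.anchorlist.append(value)

parser = AnchorCollector()
# In real use this would be the body fetched with urllib.request.
parser.feed('<a href="/dev/">Dev</a> <a href="psf/">PSF</a>')
for a in parser.anchorlist:
    # Resolve each link against the page's base URL, as in the
    # original urlparse.urljoin loop.
    print(urljoin('http://python.org/', a))
```

This prints http://python.org/dev/ and http://python.org/psf/, matching the Python 2 version's output for those links.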
You'd have to convert each URL in the loop to a file-system path, create the needed directories with os.makedirs(), and write the downloaded contents there. You'd also have to rewrite the links inside the saved page so they point at the local copies; an alternative is to prefix them with a localhost address and run a small local proxy that serves the saved files.
--
http://mail.python.org/mailman/listinfo/python-list
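A minimal sketch of the first approach, assuming the simplest scheme of mirroring each URL's host and path under a local output directory and rewriting href/src attributes with a regex (the function names and the regex-based rewrite are illustrative assumptions, not a robust HTML rewriter):

```python
import os
import re
import urllib.request
from urllib.parse import urljoin, urlparse

def local_path_for(url, out_dir):
    # Mirror the URL's host and path under out_dir; directory-style
    # URLs (ending in '/') get an index.html filename.
    parsed = urlparse(url)
    path = parsed.path.lstrip('/')
    if not path or path.endswith('/'):
        path += 'index.html'
    return os.path.join(out_dir, parsed.netloc, *path.split('/'))

def save(url, out_dir):
    # Download url to its mirrored local path, creating the
    # directories first as suggested above.
    dest = local_path_for(url, out_dir)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    urllib.request.urlretrieve(url, dest)

def localize_links(html, base_url, out_dir):
    # Rewrite href=/src= attributes so the saved page references the
    # local mirror instead of the network.
    def repl(match):
        attr, quote, url = match.groups()
        local = local_path_for(urljoin(base_url, url), out_dir)
        return '%s=%s%s%s' % (attr, quote, local, quote)
    return re.sub(r'\b(href|src)=(["\'])([^"\']*)\2', repl, html)

page = '<a href="/dev/">Dev</a> <img src="logo.png">'
print(localize_links(page, 'http://python.org/', 'mirror'))
```

A proper job would parse the HTML instead of using a regex, but this shows the two pieces the question asks for: mapping URLs to local directories, and making the saved page reference the saved files.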