On 28/11/2013 15:19, TheRandomPast . wrote:
Hi,
I've created a script that shows me how many images are on a
webpage and their URLs. Now I want to download all .jpg images
from this website and save them onto my computer. I've never done this
before and I'm a little confused about where I should go next.
Can some kind person take a look at my code and tell me if I'm
heading in completely the wrong direction?
Just to clarify, what I want to do is download all .jpg images on
dogpicturesite.com <http://dogpicturesite.com> and save them to a
directory on my computer.
Sorry if this is a really stupid question.
import re
import sys
import time
import traceback
from urllib import urlretrieve
try:
print ' imagefiles()'
The regex matches only the names of the images. Try matching their
entire URLs.
images = re.findall(r'([-\w]+\.(?:jpg))', webpage)
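To illustrate matching whole URLs rather than bare file names, here is a minimal sketch. The sample HTML, the pattern name JPG_URL and the helper find_jpg_urls are made up for illustration, and it assumes the page uses absolute http(s) links; relative links would need extra handling.

```python
import re

# Hypothetical page source standing in for the real `webpage` string.
sample_html = '''
<img src="http://dogpicturesite.com/images/puppy-1.jpg">
<img src="http://dogpicturesite.com/images/old_dog.jpg" alt="dog">
<a href="http://dogpicturesite.com/about.html">About</a>
'''

# An absolute http(s) URL ending in .jpg. The character class stops at
# whitespace, quotes and angle brackets, so a match stays inside one attribute.
JPG_URL = re.compile(r'https?://[^\s"\'<>]+\.jpg', re.IGNORECASE)

def find_jpg_urls(html):
    """Return every absolute .jpg URL found in the HTML text."""
    return JPG_URL.findall(html)

for url in find_jpg_urls(sample_html):
    print(url)
```

A regex is good enough for a quick script like this, but an HTML parser would be more robust on real pages.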
For each URL, download the image and save it into the folder. You can
make a path for each image by joining the path of the folder with the
name of the image. (That's a hint! Look in os.path.)
urlretrieve('http://dogpicturesite.com/', 'C:/images')
print "Downloading Images....."
time.sleep(5)
print "Images Downloaded."
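The per-image download could look something like the sketch below. The names local_path and download_all, and the fetch parameter, are hypothetical; fetch exists only so the loop can be tried without touching the network. The import fallback lets the same code run on Python 2 (as in your script) or Python 3.

```python
import os
import posixpath

try:
    # Python 2, as used in the original script.
    from urllib import urlretrieve
    from urlparse import urlsplit
except ImportError:
    # Python 3 moved these names.
    from urllib.request import urlretrieve
    from urllib.parse import urlsplit

def local_path(folder, url):
    """Join the target folder with the file name taken from the URL path."""
    name = posixpath.basename(urlsplit(url).path)
    return os.path.join(folder, name)

def download_all(urls, folder, fetch=urlretrieve):
    """Save each URL into folder and return the list of local paths."""
    if not os.path.isdir(folder):
        os.makedirs(folder)
    saved = []
    for url in urls:
        target = local_path(folder, url)
        fetch(url, target)  # one download per image, not one per site
        saved.append(target)
    return saved
```

Something like download_all(images, 'C:/images') would then replace the single urlretrieve call, assuming images holds full URLs.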
Don't use a 'bare' except. It swallows EVERY exception. Catch only what
you're willing to handle, and let the other exceptions just show
themselves.
except:
print "Failed to Download Images"
raw_input('Press Enter to exit...')
sys.exit()
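As a rough sketch of catching only what you can handle: urlretrieve reports network and file-system failures as IOError (on Python 3 that name is an alias of OSError), so that is the one exception caught here. The helper name try_download and its fetch parameter are hypothetical, added so the behaviour can be tried without a network connection.

```python
try:
    from urllib import urlretrieve          # Python 2
except ImportError:
    from urllib.request import urlretrieve  # Python 3

def try_download(url, target, fetch=urlretrieve):
    """Return True on success, False if the download hit an I/O error."""
    try:
        fetch(url, target)
    except IOError as err:
        # Only I/O problems are handled here. Anything else (a typo, a
        # wrong argument type, Ctrl-C) still shows its own traceback.
        print("Failed to download %s: %s" % (url, err))
        return False
    return True
```

With this shape, a bug elsewhere in the script is no longer hidden behind "Failed to Download Images".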
def main():
    sys.argv.append('http://dogpicturesite.com/')
    if len(sys.argv) != 2:
        print '[-] Image Files'
        return
    page = webpage.webpage(sys.argv[1])
    imagefiles(page)