Looking at the page source, I think something like this should work:
import re
import urllib

page = urllib.urlopen('http://finance.blog.lemonde.fr').read()

# Matches single-quoted image sources, e.g.
# <img src='http://finance.blog.lemonde.fr/filescropped/7642_300_400/2011/04/1157.1301668834.jpg'
x = re.findall(r"<img\s+src='([\S]+)'", page)

# Matches double-quoted image sources, e.g.
# <img src="http://s2.lemde.fr/image/2011/02/16/87x0/1480844_7_87fe_bandeau-lycee-electrique.jpg"
y = re.findall(r"<img\s+src=\"([\S]+)\"", page)

x.extend(y)
x = list(set(x))  # drop duplicate URLs
for img in x:
    # keep only the .jpg images
    image = img.split('.')[-1]
    if image == 'jpg':
        print img
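
The two patterns differ only in the quote character, so they could probably be folded into one by capturing the opening quote and reusing it as a backreference. The following is only a rough sketch of that idea, not a tested replacement for the code above: the helper name extract_jpg_urls and the sample markup are made up for illustration, and it works on a string you already have rather than downloading the page. It also compares the extension case-insensitively and keeps the URLs in the order they first appear, which the set() version does not.

```python
import re

# Hypothetical helper (not from the original post): one pattern covers both
# quoting styles by capturing the opening quote and requiring the same
# character to close the attribute value.
IMG_SRC = re.compile(r"""<img\s+src=(['"])(\S+?)\1""")

def extract_jpg_urls(html):
    """Return unique .jpg image URLs in order of first appearance."""
    seen = set()
    urls = []
    for _quote, src in IMG_SRC.findall(html):
        if src.lower().endswith('.jpg') and src not in seen:
            seen.add(src)
            urls.append(src)
    return urls

# Made-up sample markup, standing in for the downloaded page.
sample = (
    "<img src='http://example.com/a.jpg'>"
    '<img src="http://example.com/b.png">'
    '<img src="http://example.com/c.jpg" alt="x">'
    "<img src='http://example.com/a.jpg'>"
)
jpg_urls = extract_jpg_urls(sample)
```

Like the original patterns, this only catches tags where src comes directly after <img; if the pages put other attributes first, an HTML parser would likely be more reliable than a regex.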