Re: [Tutor] Pictures

louis leichtnam Wed, 04 May 2011 19:50:33 -0700

Hello,

I tried it and it works, but if I change the web site from
http://finance.blog.lemonde.fr to http://www. ...something else, it no longer
works.


Do I have to change the '([\S]+)' part
in x=re.findall(r"<img\s+src='([\S]+)'",page)? If so, into what?

Thanks a lot

2011/4/28 naheed arafat <[email protected]>

> From looking at the page source, I think:
>
>     import re
>     import urllib
>
>     # Python 2: fetch the raw HTML of the page
>     page=urllib.urlopen('http://finance.blog.lemonde.fr').read()
>
>     x=re.findall(r"<img\s+src='([\S]+)'",page)
>     # matches image sources in single quotes, like:
>     # <img src='http://finance.blog.lemonde.fr/filescropped/7642_300_400/2011/04/1157.1301668834.jpg'
>     y=re.findall(r"<img\s+src=\"([\S]+)\"",page)
>     # matches image sources in double quotes, like:
>     # <img src="http://s2.lemde.fr/image/2011/02/16/87x0/1480844_7_87fe_bandeau-lycee-electrique.jpg"
>     x.extend(y)
>     x=list(set(x))  # drop duplicates
>     for img in x:
>         image=img.split('.')[-1]  # file extension
>         if image=='jpg':
>             print img
>
>
>
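
The two patterns quoted above only match when src is the very first
attribute after <img, with no spaces around the = sign. Pages that write
something like <img class="..." src="..."> will therefore return nothing,
which is probably what happens on the other site. Below is a minimal sketch
of a more tolerant pattern. It is written for Python 3 and works on a string
that has already been downloaded, so no network access is needed. The names
IMG_SRC and find_image_urls are hypothetical and not from this thread, and
the sample HTML is made up for illustration.

```python
import re

# Hypothetical pattern (an assumption, not the pattern from the thread):
#   <img            the tag name (case-insensitive via re.IGNORECASE)
#   \s+             at least one whitespace character after the tag name
#   (?:[^>]*?\s)?   optionally skip other attributes that come before src
#   src\s*=\s*      the src attribute, allowing spaces around '='
#   (['"])(.*?)\1   the value, in single OR double quotes (\1 = same quote)
IMG_SRC = re.compile(
    r"""<img\s+(?:[^>]*?\s)?src\s*=\s*(['"])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def find_image_urls(page, extension=None):
    """Return unique image URLs found in `page`, in order of appearance.

    If `extension` is given (for example "jpg"), keep only URLs ending
    with that extension, compared case-insensitively.
    """
    urls = []
    seen = set()
    for match in IMG_SRC.finditer(page):
        url = match.group(2).strip()
        if url in seen:
            continue  # skip duplicates
        seen.add(url)
        if extension is None or url.lower().endswith("." + extension.lower()):
            urls.append(url)
    return urls


# Made-up sample page, standing in for the downloaded HTML.
sample = """
<img src='http://example.com/a.jpg'>
<img class="thumb" src="http://example.com/b.png" alt="b">
<IMG SRC="http://example.com/c.JPG">
<img src='http://example.com/a.jpg'>
"""

print(find_image_urls(sample))
# ['http://example.com/a.jpg', 'http://example.com/b.png', 'http://example.com/c.JPG']
print(find_image_urls(sample, "jpg"))
# ['http://example.com/a.jpg', 'http://example.com/c.JPG']
```

A regular expression is still fragile on real pages (relative URLs,
unquoted attribute values, and images loaded by scripts are all missed), so
an HTML parser is usually the more robust choice once the pages get messy.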

_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
