Re: WWW::Mechanize : Is immediate caching of images possible?
On 1/4/08, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

>     my $mech = WWW::Mechanize->new();
>     $mech->get($url);
>
>     my @links = $mech->find_all_images();
>     foreach my $link (@links){
>         my $imageurl = $link->url_abs();
>         $imageurl =~ m/([^\/]+)$/;
>         $mech->get($imageurl, ':content_file' => $1);
>     }
>
> My current problem with this is that I'm trying to dl an image generated
> with information from the session of the original get($url).

How is the session created, and passed to the image?

If the session is established by a cookie in the HTTP response headers, your first request should create the cookie and WWW::Mechanize should remember it. Your subsequent requests for the images will then carry the same cookie, and the server should be able to associate them with the same session.

If the session is established by a cookie set by JavaScript, then you'll need to parse the cookie value out of the page (JavaScript) content yourself, set up your own cookie jar explicitly, and populate it with that value once you discover it.

A web browser isn't doing much more than the logic you have above. HTTP is inherently stateless, and aside from mechanisms like the HTTP Referer request header, cookies, and the URL itself, there's little else a browser can do to carry information from one request to the next. I believe WWW::Mechanize supports all of this, so if the above doesn't help, more information may be needed.

David
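For the JavaScript case mentioned above, a minimal sketch of the explicit cookie-jar approach might look like the following. The cookie name ('sessionid'), the pattern used to pull its value out of the page, and the '.domain.com' domain are hypothetical placeholders; the real names depend entirely on the site's JavaScript:

    use strict;
    use warnings;
    use WWW::Mechanize;
    use HTTP::Cookies;

    my $url  = 'http://www.domain.com/';   # page that establishes the session
    my $mech = WWW::Mechanize->new();
    $mech->get($url);

    # Hypothetical: the page sets the cookie from JavaScript, e.g.
    #   document.cookie = "sessionid=abc123";
    # so pull the value out of the page source by hand.
    my ($session) = $mech->content =~ /sessionid=([^";]+)/;

    # Build an explicit cookie jar, add the value, and hand the jar to
    # Mechanize so it is sent with every later request (the image gets).
    my $jar = HTTP::Cookies->new();
    $jar->set_cookie(0, 'sessionid', $session, '/', '.domain.com');
    $mech->cookie_jar($jar);

Once the jar holds the value, the image-fetching loop from the original message can run unchanged on the same $mech object.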
WWW::Mechanize : Is immediate caching of images possible?
Traditionally, when using WWW::Mechanize to download images, I first fetch the root page:

    my $mech = WWW::Mechanize->new();
    $mech->get($url);

then proceed to find all the images and get() them one by one (forgive the crude code):

    my @links = $mech->find_all_images();
    foreach my $link (@links){
        my $imageurl = $link->url_abs();
        $imageurl =~ m/([^\/]+)$/;
        $mech->get($imageurl, ':content_file' => $1);
    }

My current problem with this is that I'm trying to download an image generated with information from the session of the original get($url). It's not a static *.jpg or something simple; it's a black box that displays an image relevant to the session. Meaning, when I fetch the image (http://www.domain.com/image/, which is embedded in the page) as shown above, it's a new request and I get a completely random image.

Is there a way to cache the images that are loaded during the initial get($url), so that the image matches the content of the page retrieved? Or even to capture the session information transmitted to the black box, domain.com/image/, so I can clone that information and submit it with the get($imageurl)?

Ideally I would like a routine along the lines of:

    $mech->getComplete($url, $directory);

which would save the source, images, etc. associated with the page, analogous to Save -> Web Page, Complete in Firefox.

Thanks all. I think I'm getting pretty proficient with WWW::Mechanize, but don't be afraid to respond as if I were an idiot, so we know your answer doesn't go over my head.

Hikari
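There is no getComplete() in WWW::Mechanize itself, but something close can be assembled from the pieces already in this thread. The sketch below is only illustrative: get_complete() is a made-up name, there is no error handling, and it assumes the session (if any) is carried by cookies, so that reusing the same $mech object for the image requests is enough:

    use strict;
    use warnings;
    use WWW::Mechanize;
    use File::Spec;
    use File::Path qw(make_path);

    # Hypothetical helper: save the page source plus every image it
    # references into $directory, reusing one Mechanize object so any
    # session cookie from the first get() is sent with the image requests.
    sub get_complete {
        my ($url, $directory) = @_;
        make_path($directory);

        my $mech = WWW::Mechanize->new();
        $mech->get($url);

        # Save the page source first, before further get() calls replace
        # the current content.
        open my $fh, '>', File::Spec->catfile($directory, 'index.html') or die $!;
        print {$fh} $mech->content;
        close $fh;

        for my $image ($mech->find_all_images()) {
            my $imageurl = $image->url_abs();
            my ($name) = $imageurl =~ m/([^\/]+)$/;
            next unless defined $name;
            $mech->get($imageurl, ':content_file' => File::Spec->catfile($directory, $name));
        }
    }

    get_complete('http://www.domain.com/page', 'saved_page');

One caveat: the initial get($url) only fetches the HTML, not the embedded images, so there is nothing from that first request to cache; and if the black box generates a fresh image on every hit regardless of session, even a correct cookie may not reproduce the exact image that went with the original page view.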