Greetings,

I'm having a problem parsing specific IMG tags from web pages.
The code (see below) has worked for over a year.
It parses image tags from specific eBay HTML pages.
Certain IMG tags are now intermittently not "showing up".
For the same HTML, sometimes I get all the IMG tags and
sometimes specific ones are missing.
I can't reproduce it with a browser.
I'm using LWP and HTML::LinkExtor to parse the img tags.

I want to capture the HTML page that I'm parsing (within the same
UserAgent request) so I can see the img tag format when they don't
parse.
I've fumbled with several attempts at doing this so far.

Any assistance is most appreciated.

Thanks,
Steve

  use LWP::UserAgent;
  use HTML::LinkExtor;
  use URI::URL;    # provides url()

  # Example URL
  $itemurl = 'http://cgi.ebay.com/aw-cgi/eBayISAPI.dll?ViewItem&item=1605076637';

  @imgs = ();
  $ua = LWP::UserAgent->new;

  sub callback {
    my($tag, %attr) = @_;
    return if $tag ne 'img';  # we only look closer at <img ...>
    push(@imgs, values %attr);
  }

  # Make the parser. Don't know the base yet, might be diff from $itemurl
  $p = HTML::LinkExtor->new(\&callback);

  # Request document and parse it as it arrives
  $res = $ua->request(HTTP::Request->new(GET => $itemurl),
    sub{$p->parse($_[0])});

  # Expand all image URLs to absolute ones
  my $base = $res->base;
  @imgs = map { $_ = url($_, $base)->abs; } @imgs;
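
A minimal sketch of one way the page could be captured within the same request, assuming the content callback is the only place the body is visible (when a callback is passed to request(), LWP hands the chunks to it instead of storing them in the response). The names make_capture and fetch_and_capture are hypothetical helpers for illustration, not part of LWP or HTML::LinkExtor. Each chunk is appended to a buffer before it is passed to the parser, so the saved text is exactly what the parser was given.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTML::LinkExtor;

# Hypothetical helper: returns a chunk handler and a finisher.
# The handler keeps a copy of every chunk and feeds it to the parser.
sub make_capture {
    my @imgs;
    my $html = '';

    my $p = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        return if $tag ne 'img';      # only <img ...> tags
        push @imgs, values %attr;
    });

    my $on_chunk = sub {
        my ($chunk) = @_;
        $html .= $chunk;              # keep the raw text first
        $p->parse($chunk);            # then parse it as before
    };

    my $finish = sub {
        $p->eof;                      # tell the parser the input has ended
        return ($html, @imgs);
    };

    return ($on_chunk, $finish);
}

# Hypothetical helper: one request, returns response, raw HTML and img values.
sub fetch_and_capture {
    my ($url) = @_;
    my ($on_chunk, $finish) = make_capture();

    my $ua  = LWP::UserAgent->new;
    my $res = $ua->request(HTTP::Request->new(GET => $url), $on_chunk);

    # LWP records an exception thrown inside the callback in this header.
    warn "callback died: ", $res->header('X-Died'), "\n"
        if $res->header('X-Died');

    my ($html, @imgs) = $finish->();
    return ($res, $html, @imgs);
}
```

Calling fetch_and_capture($itemurl) and writing the returned HTML to a file whenever the image count looks short would give a copy of the exact page that failed to parse, which can then be compared against a good run.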



