Look back at the output of 'print $b->current_form()->dump();'.  Do you
see where the option for 'Anthropology' appears by itself?  That happens
because the HTML is not being parsed correctly.  The following line seems
to be the offender:

                <option value="ANT" Name="Anthropology">Anthropology</option>

The 'Name' attribute seems to be confusing the form parser so
Anthropology is not one of the available options.  This is where
HTML::TreeBuilder comes in to help clean up problematic HTML.  I suggest
the following bit of code to clean it up:

 
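  # $b is the WWW::Mechanize object from the script quoted below, which
  # also loads HTML::TreeBuilder.
  $b->get("http://lca.lehman.cuny.edu/dept/registrar/schedule/coursefinder.asp");

  my $root = HTML::TreeBuilder->new_from_content($b->content);
  # Collect the <option> elements whose 'name' attribute matches 'Ant'.
  my @List = $root->look_down( '_tag' => 'option' ,
          sub { $_[0]->attr('name') && $_[0]->attr('name') =~ qr/Ant/ }
  );
  foreach my $Element (@List) {
      $Element->attr('name', undef);   # undef deletes the attribute
  }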

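The look_down method is inherited from HTML::Element.  It finds elements
that meet the given criteria.  In this case the criteria are that the
element has a tag of 'option' AND has an attribute 'name' with a value
matching 'Ant'.  The loop then goes through the list and removes that
attribute from each element.  Note that this only changes the tree held
in $root; the cleaned markup (for example from $root->as_HTML) still has
to be handed to whatever parses the form.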
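
Here is a small offline sketch of the same idea, in case it helps to try
it without hitting the site.  The HTML snippet, the select name 'u_input'
and the example.com base URI are assumptions made up for the
illustration; only the offending option line is taken from the real
page.  It strips the stray attribute with look_down, serializes the tree
with as_HTML, and asks HTML::Form which values the select accepts.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Form;

# Made-up stand-in for the relevant part of the page (an assumption).
my $html = <<'HTML';
<form action="coursefinder.asp" method="post">
<select name="u_input">
<option value="ANT" Name="Anthropology">Anthropology</option>
<option value="CHE">Chemistry</option>
</select>
</form>
HTML

my $root = HTML::TreeBuilder->new_from_content($html);

# Same criteria as above: <option> elements with a 'name' matching 'Ant'.
my @list = $root->look_down(
    '_tag' => 'option',
    sub { $_[0]->attr('name') && $_[0]->attr('name') =~ /Ant/ },
);

# Setting an attribute to undef removes it from the element.
$_->attr('name', undef) for @list;

# Serialize the cleaned tree, then free it.
my $clean = $root->as_HTML;
$root->delete;

# Parse the cleaned markup; the base URI here is a placeholder.
my ($form) = HTML::Form->parse($clean, 'http://example.com/');
my @values = $form->find_input('u_input')->possible_values;
print "u_input accepts: @values\n";
```

If 'ANT' shows up in the printed list, the cleaned markup is something
the form parser can work with.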

-Dgg

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
bruce
Sent: Wednesday, June 02, 2004 10:02 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: :mechanize issues/mechanize.pm dies!!


hi..

i'm having an issue with the following script. it uses www::mechanize and
appears to die when i use the word/var "ANT" for one of the form inputs.
the test script demonstrates getting information from the test site using
both LWP and mechanize. the LWP approach works, the mechanize fails.

any ideas/comments/criticisms as to why the mechanize fails would be
appreciated..

you can see the mechanize works by using "CHE" instead of "ANT"...


###########################################################

#!/usr/bin/perl -w


use WWW::Mechanize;
use HTML::TreeBuilder;
use LWP::UserAgent;
   $ua = new LWP::UserAgent;
   $ua->timeout(30);
   $ua->agent("AgentName/0.1 " . $ua->agent);

   my $req = new HTTP::Request POST =>
'http://lca.lehman.cuny.edu/dept/registrar/schedule/coursefinder.asp';

   my $query;
   my $cstr;
   $cstr =
"?term=FAL04CRS&division=D&u_input=ANT&sortby=Disc%2C+Coursenum%2C+Section&order=+ASC";

   $req->content_type('application/x-www-form-urlencoded');
   $req->content($cstr);

   my $res = $ua->request($req);
   my $q1c = $res->content;
   print $q1c;

#
# using the above LWP method works ok...
# implementing with www::mechanize causes issues
#
# it appears that using "ANT" for "u_input" causes the
# mechanize function to run into issues. other information
# for "u_input" works ok... but "ANT" causes the app to die...
#

   my (@_semester, @_dept, @_sort, @_order1, @_order, @_div);
   my ($default_semester, $default_div, $default_order);

$default_semester="FAL04CRS";
$default_div="D";

$curdept="ANT";  ## <<<<<<< major issue!!!!!!!

$default_sort="Disc Coursenum Section";
$default_order="+ASC";

my $b = WWW::Mechanize->new();


$b->get("http://lca.lehman.cuny.edu/dept/registrar/schedule/coursefinder.asp");
         $b->form_number(1);
            #print $b->current_form()->dump();
         $b->field("term", $default_semester);
         $b->field("division", $default_div);
         $b->field("u_input", $curdept);
         $b->field("sortby", $default_sort);
         $b->field("order", $default_order);
         #open(F, ">out.html");
         #print F $b->submit()->content();
          #close(F);

   #      &parse_file();
   print "uuu ".$default_semester."  ".$default_div."  ".$curdept."  ".
$default_sort."  ".$default_order."\n";

die;
###########################################################
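
A side note on the LWP half of the script above: the POST body is built
by hand and starts with a '?', which belongs to URL query strings rather
than form bodies.  One possible alternative, shown here only as a sketch,
is to let HTTP::Request::Common encode the fields.  The field values
below are the decoded forms of the hand-built string (an assumption about
what was intended), and the request is only constructed and printed,
never sent.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

my $url =
  'http://lca.lehman.cuny.edu/dept/registrar/schedule/coursefinder.asp';

# Key/value pairs; POST() URL-encodes them and sets the content type.
my $req = POST $url, [
    term     => 'FAL04CRS',
    division => 'D',
    u_input  => 'ANT',
    sortby   => 'Disc, Coursenum, Section',
    order    => ' ASC',
];

# Inspect what would be sent; pass $req to $ua->request() to send it.
print $req->method, " ", $req->uri, "\n";
print $req->content_type, "\n";
print $req->content, "\n";
```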


thanks...

bruce

_______________________________________________
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
