hello perl users,  today i am struggling to better understand what the
underlying issue is here.

my employer uses a web-based CRM called 'salesforce.com'.

using a web browser, i log into this site, then use its web interface
to eventually display a 'customer record' web page in my browser.  some
of the elements i see on the page include 'products they own', 'contacts',
'recent activities', etc.

if i:

a. use the browser's 'view page source', i can actually see/read the
html that clearly lists the customer's contact names, email addresses,
etc.  in other words, everything i see on the rendered web page in the
browser, i can also read in the underlying html and data fragments when
'viewing page source'.

b. 'save as' the currently viewed web page to my local hard drive and open
the resulting *.htm file with a text editor, i am once again able to see
all elements, including the customer's contact names and their emails.

now i had the thought, using LWP::Simple i should be able to simply get this
same URL that's presented in my web browser.

my first test.

1. i copied this URL and opened a new tab in my web browser that already
had a login session going with salesforce.com, in the address field of the
new tab, i pasted the URL, then hit the 'enter' key.  voila, the exact same
page (as was currently displayed on a different tab in my browser) also
displayed equally well with all data being displayed.

2. i then wrote an LWP::Simple script where i pasted the exact same URL and
ran my script.  the relevant lines...

use LWP::Simple;
# $url holds the same URL copied from the browser's address bar
my $HTTP_response_code = LWP::Simple::mirror($url, 'test000.htm');
print $HTTP_response_code;

this shows a status code of 200 (page retrieved), and resulted in a file
'test000.htm' being written into the cwd.  however, when i view the
contents of the saved file, it's nothing close to the browser's 'view page
source' or to the contents of the web page saved locally from within the
web browser.
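
to narrow down what the saved file actually is, a rough check like the one
below might help.  it is only a sketch: page_hints is a made-up helper name,
and the regexes are crude guesses at what a login form or a script-heavy page
looks like, not a real html parser.

```perl
use strict;
use warnings;

# hypothetical helper: rough hints about what kind of page an html string is
sub page_hints {
    my ($html) = @_;

    # count opening <script> tags (closing </script> tags do not match)
    my $scripts = () = $html =~ /<script\b/gi;

    # a password input suggests we were handed a login page instead
    my $login = $html =~ /<input\b[^>]*type\s*=\s*["']?password/i ? 1 : 0;

    # pull out the page title, if there is one
    my ($title) = $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is;

    return {
        script_tags      => $scripts,
        looks_like_login => $login,
        title            => defined $title ? $title : '',
    };
}

# when given a file name on the command line, print the hints for that file
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "cannot open $file: $!";
    my $html = do { local $/; <$fh> };    # slurp the whole file
    close $fh;

    my $hints = page_hints($html);
    print "$_: $hints->{$_}\n" for sort keys %$hints;
}
```

saved under a name of my choosing (say hints.pl) and run against both
'test000.htm' and the copy saved from the browser, the two outputs should
show whether the script got a login page, a mostly empty page full of
script tags, or something else entirely.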

my only guess here is that perhaps some elements of the page are
dynamically created via javascript, or other client-side browser
technology, which a plain LWP::Simple get would not execute.

if that is the reason, does anyone know if there is a notion of 'simulating
a browser' via a Perl script, so i could do more than an HTTP get and
instead simulate the full function of what a 'normal browser' would do,
essentially building the full contents of a page using JavaScript, so that
when i then save the contents of the page to a file to evaluate, it's got
all dynamic content in place, and nothing is missing.

perhaps there are other reasons why i am getting this behavior.  didn't know
if any others have tried hacking at web pages in this manner before and
might have had a similar experience.
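
one other reason i can think of: my browser tab already had a login session
going, and my script does not.  the sketch below is how i might test that
with LWP::UserAgent and an in-memory cookie jar.  build_agent and
fetch_report are names i made up, the user agent string is arbitrary, and it
assumes the script would still have to log in by itself first (i do not know
what that login request looks like for salesforce.com).  a cookie jar does
not run any javascript either.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# hypothetical helper: a user agent that keeps cookies between requests,
# the way a browser session does
sub build_agent {
    my $ua = LWP::UserAgent->new(
        cookie_jar => {},    # empty hash ref asks LWP for an in-memory jar
        timeout    => 30,
    );
    $ua->agent('Mozilla/5.0 (perl test script)');    # arbitrary string
    return $ua;
}

# hypothetical helper: fetch a URL and report where we really ended up
sub fetch_report {
    my ($ua, $url) = @_;
    my $res = $ua->get($url);
    return {
        status    => $res->code,
        final_url => $res->request->uri->as_string,  # URL after any redirects
        redirects => scalar($res->redirects),         # number of redirects
        body      => $res->decoded_content,
    };
}

# only touches the network when a URL is given on the command line
if (@ARGV) {
    my $report = fetch_report(build_agent(), $ARGV[0]);
    print "status:    $report->{status}\n";
    print "final url: $report->{final_url}\n";
    print "redirects: $report->{redirects}\n";
}
```

if the final url differs from the one i pasted in, or there were redirects,
that would point at being bounced to a login page rather than at missing
javascript.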

greg