OK I've realized that the contents I want to get at are loaded dynamically
via JavaScript -- and that if I do something like

   # -nd: no dirs, -E: add .html, -k: convert links, -K: keep .orig, -p: page requisites
   wget -nd -E -k -K -p http://www.nyse.com/about/listed/lc_ny_name_A.html

then I'll get a bunch of files, including the basic JavaScript file

   lc_ny_name_A.js

This JavaScript file has a big hard-coded variable in it that specifies some
of the basic information that I want -- so I could try to parse the variable
out of the JavaScript to get what I'm after.
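
For example, something like this might do it (just a sketch -- the variable
name "companyData" is a guess, and I'd have to check the actual name used in
lc_ny_name_A.js):

   # Print the hard-coded assignment (variable name is hypothetical),
   # stripping the surrounding "var ... = ...;" so only the data remains.
   sed -n 's/.*var companyData *= *\(.*\);.*/\1/p' lc_ny_name_A.js

This assumes the whole assignment sits on one line, which may or may not be
true of the real file.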

However, if I do "save file" from Firefox, I get a static .html page with
everything I really want to parse in it ... Is there any way to simply
have wget do something like what Firefox does -- so that I can actually just
download the page _after_ the dynamic elements have been loaded and
processed? Or is my thinking on this totally wrong?

Thanks
Dan


On Sat, Feb 14, 2009 at 6:01 PM, Dan Yamins <[email protected]> wrote:

> Hi,
>
> I've been having some trouble downloading several pages with wget -- for
> instance:
>
>    http://www.nyse.com/about/listed/lc_all_name_F.html
>
> This page is downloaded and displayed fine by Firefox, etc... but when
> I try to wget it, I only get a small piece of the page.
>
> On the one hand, it looks like what may be happening is that when I look at
> it in the browser, most of the page data only downloads after a few seconds
> of waiting -- but that wget doesn't seem to wait long enough and closes the
> download before it is done.
>
> Or is this maybe a "user-agent" issue -- is the website I'm trying to
> download from discriminating against systematic downloads?
>
> Any help would be appreciated!
>
> best
> Dan
>
>
>
>
