Re: No downloading

2008-06-29 Thread mm w
Either the default index is not named index, or there is a server-side
HTTP check on HTTP_USER_AGENT.

On Sun, Jun 29, 2008 at 1:42 PM, Mishari Almishari [EMAIL PROTECTED] wrote:
 Hi,
 I want to download the website www.2006election.net

 For that, I used the command
 wget -d -nd -p -E -H -k -K -S -R png,gif,jpg,bmp,ico --ignore-length \
   --user-agent=Mozilla -e robots=off -P www.2006election.net \
   -o www.2006election.net.out http://www.2006election.net

 But the downloaded page index.html has no content (except body/head tags),
 even though I can see the content when I use Internet Explorer.

 Any clue?

 Thanks in advance!

 -mish



-- 
-mmw


Re: No downloading

2008-06-29 Thread Micah Cowan

 On Sun, Jun 29, 2008 at 1:42 PM, Mishari Almishari [EMAIL PROTECTED] wrote:
 Hi,
 I want to download the website www.2006election.net

 For that, I used the command
 wget -d -nd -p -E -H -k -K -S -R png,gif,jpg,bmp,ico --ignore-length \
   --user-agent=Mozilla -e robots=off -P www.2006election.net \
   -o www.2006election.net.out http://www.2006election.net

 But the downloaded page index.html has no content (except body/head tags),
 even though I can see the content when I use Internet Explorer.

mm w wrote:
 Either the default index is not named index, or there is a server-side
 HTTP check on HTTP_USER_AGENT.

The first one could not possibly cause problems, since he's not
requesting any URLs with index.html in them.

The HTTP_USER_AGENT thing is the problem. Mishari tried to specifically
handle this with the --user-agent line, but it apparently wasn't
convincing enough. I got it to work with:

  --user-agent='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)'
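
For reference, a sketch of the full command with that user-agent string
substituted in. Every other option is copied unchanged from the original
post. The fetch_site function and the site and ua variable names are only
illustrative, and it is an assumption that the server still checks the
header the same way.

```shell
# Sketch only: the original command with the fuller user-agent string.
# site, ua and fetch_site are hypothetical names chosen for this example.
site='www.2006election.net'
ua='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)'

# Nothing is downloaded until fetch_site is called.
fetch_site() {
  wget -d -nd -p -E -H -k -K -S -R png,gif,jpg,bmp,ico --ignore-length \
    --user-agent="$ua" -e robots=off -P "$site" -o "$site.out" \
    "http://$site"
}
```

Running fetch_site starts the download. The quotes around "$ua" matter,
because the string contains spaces, semicolons and parentheses that the
shell would otherwise interpret.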

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/


Re: No downloading

2008-06-29 Thread Micah Cowan

Petr Pisar wrote:
 On 2008-06-29, Mishari Almishari [EMAIL PROTECTED] wrote:
 Hi,
 I want to download the website www.2006election.net
 

 But the downloaded page index.html has no content (except body/head tags),
 even though I can see the content when I use Internet Explorer.

 This is not a bug, it's a feature. All the content you see in IE is
 generated by JavaScript. See the source code of the web page in IE.

No, the command he gives literally yields a completely empty web page:

<html>
<body>
</body>
</html>



Handling Ajax (was Re: No downloading)

2008-06-29 Thread Paul King
I just want to de-lurk for a minute. I have been using wget on a regular basis
for various websites.

If JavaScript is responsible for writing the content, then you have a web page
that probably uses Ajax and would be dynamically updatable. Since Ajax use is
on the rise, I wonder if anyone here can say how wget deals with sites
using Ajax?

Paul King


Re: Handling Ajax (was Re: No downloading)

2008-06-29 Thread Micah Cowan

Paul King wrote:
 I just want to de-lurk for a minute. I have been using wget on a regular
 basis for various websites.

 If JavaScript is responsible for writing the content, then you have a web
 page that probably uses Ajax and would be dynamically updatable. Since Ajax
 use is on the rise, I wonder if anyone here can say how wget deals with
 sites using Ajax?

Not so well, generally speaking. Wget isn't going to do any
JavaScript interpreting on its own, so it really depends. If the
JavaScript was written in certain ways, it's possible the downloaded copy
will just magically work when you fire it up in your browser. It's not
unlikely that it fails miserably. :\

Ultimately, I think it depends on the site.
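
One rough way to tell which case you are in is to look at the file wget
saved. The sketch below is only a heuristic. It checks for script tags and
for the usual Ajax entry points (XMLHttpRequest, or the older IE ActiveX
name). The uses_script and uses_xhr function names and the demo page are
made up for this example, and scripts loaded from external .js files would
need the same check applied to those files.

```shell
# uses_script FILE: succeeds if FILE contains a <script tag (any case).
uses_script() {
  grep -qi '<script' "$1"
}

# uses_xhr FILE: succeeds if FILE mentions XMLHttpRequest or the older
# IE ActiveX object name, which hints at Ajax-style loading.
uses_xhr() {
  grep -qE 'XMLHttpRequest|Microsoft\.XMLHTTP' "$1"
}

# Hypothetical demo page; in practice, point the checks at the saved file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
<html><body>
<div id="news"></div>
<script>
var req = new XMLHttpRequest();
req.open("GET", "/news.txt", true);
req.send(null);
</script>
</body></html>
EOF

if uses_script "$demo" && uses_xhr "$demo"; then
  echo "page likely fills in content with Ajax; wget saved only the skeleton"
fi
```

If both checks succeed, the saved page will only show its content when a
browser runs the script and the script can still reach the URLs it asks for.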
