Re: [Bug-wget] CNET download links not working with WGET

2011-06-08 Thread Jeff Givens
No, it's not working.  It downloads part of the URL and creates a file 
named 3001-8022_4-10804572.html@spi=077d9109e846975d0db9532bd610588f.1 
which is 68 KB.  I cannot get wget to treat the string of characters as a 
whole URL.  Please help, I really need to get this script working and 
the only place to download this file is from CNET.



So... looks like it works, then. Your command shell isn't complaining
about weird command names, wget is clearly requesting the full and
correct URL, it follows redirections, and saves using the final
redirection URL (the latest sources wouldn't follow that last step -
it'd save using the request URI by default).

If you dislike the filename, then provided you have a recent enough
version of wget you can add the --content-disposition option if the
server provides a rename header (Content-Disposition); or else use -E
to have wget force the file name to end in .html

-mjc


Re: [Bug-wget] CNET download links not working with WGET

2011-06-08 Thread Micah Cowan
If you read the most recent output of wget that you gave (after quoting 
the URL), it _does_ treat the string of characters as a whole URL. The 
server redirects it to a shorter URL. If I enter that same URL into a 
browser, it does the same redirection there, and results in an HTML 
page, just like what wget gets. That page seems to have some JavaScript 
or something that initiates a separate download of something else; I 
suppose that something else is what you wanted. As you may know, wget 
doesn't execute JavaScript code from a webpage, so you'll need to find 
the real URL to the thing you wanted to download, and feed that to wget.


-mjc
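
One rough way to hunt for the real URL is to search the saved HTML, scripts included, for absolute URLs that end in an installer-like extension. The Python sketch below assumes the link appears literally somewhere in the page text, which may not hold for the CNET page; the sample page, the extension list and the helper name `find_download_urls` are made up for illustration.

```python
import re

# Absolute http(s) URLs that end in a typical download extension.
# The extension list is an assumption; adjust it to the file you expect.
CANDIDATE = re.compile(r"""https?://[^\s"'<>]+?\.(?:exe|msi|zip)\b""", re.IGNORECASE)


def find_download_urls(html):
    """Return candidate download URLs in order of first appearance."""
    seen = []
    for match in CANDIDATE.finditer(html):
        url = match.group(0)
        if url not in seen:
            seen.append(url)
    return seen


# Hypothetical page: the download target only appears inside a script,
# so following <a href> links alone would not find it.
SAMPLE_PAGE = """
<html><body>
<a href="http://example.invalid/help.html">Help</a>
<script>
  var target = "http://example.invalid/files/setup.exe";
  window.location = target;
</script>
</body></html>
"""

print(find_download_urls(SAMPLE_PAGE))
```

Any URL found this way can then be passed to wget in double quotes. If the page builds the link at run time, nothing will match, and the browser's own download history is the next place to look.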

On 06/08/2011 09:38 AM, Jeff Givens wrote:

No, it's not working. It downloads part of the URL and creates a file
named 3001-8022_4-10804572.html@spi=077d9109e846975d0db9532bd610588f.1
which is 68 KB. I cannot get wget to treat the string of characters as a
whole URL. Please help, I really need to get this script working and the
only place to download this file is from CNET.






Re: [Bug-wget] CNET download links not working with WGET

2011-06-08 Thread Jeff Givens
Micah, thanks for your help.  That was the piece that I was missing.  I 
didn't realize it was re-directing to another site.  I was able to find 
out the other site it was going to, download the executable and then I 
just put in a command to re-name the exe file since it was named after the 
URL.  Thanks again for your help.
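
For a script, wget's `-O` option writes straight to a chosen name, which would avoid the separate rename step. Below is a minimal Python sketch of building such a command; the URL and `setup.exe` are placeholders, since the thread does not give the real download location, and `wget_command` and `download` are hypothetical helper names. Passing the arguments as a list means no shell parses the `&` characters.

```python
import subprocess


def wget_command(url, output_name):
    """Build a wget argument list that saves `url` as `output_name`."""
    return ["wget", "-O", output_name, url]


def download(url, output_name):
    """Run wget without a shell, so '&' in the URL needs no quoting."""
    return subprocess.run(wget_command(url, output_name), check=True)


# Placeholder values for illustration only; nothing is downloaded here.
cmd = wget_command("http://example.invalid/file?a=1&b=2", "setup.exe")
print(cmd)
```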


If you read the most recent output of wget that you gave (after 
quoting the URL), it _does_ treat the string of characters as a whole 
URL. The server redirects it to a shorter URL. If I enter that same 
URL into a browser, it does the same redirection there, and results in 
an HTML page, just like what wget gets. That page seems to have some 
JavaScript or something that initiates a separate download of 
something else; I suppose that something else is what you wanted. As 
you may know, wget doesn't execute JavaScript code from a webpage, so 
you'll need to find the real URL to the thing you wanted to download, 
and feed that to wget.


-mjc



Re: [Bug-wget] CNET download links not working with WGET

2011-05-26 Thread Micah Cowan
So... looks like it works, then. Your command shell isn't complaining
about weird command names, wget is clearly requesting the full and
correct URL, it follows redirections, and saves using the final
redirection URL (the latest sources wouldn't follow that last step -
it'd save using the request URI by default).

If you dislike the filename, then provided you have a recent enough
version of wget you can add the --content-disposition option if the
server provides a rename header (Content-Disposition); or else use -E
to have wget force the file name to end in .html

-mjc
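
To see where the odd file name comes from: wget names the file after the last path component of the URL plus the query string, and on Windows it writes `@` where the URL has `?`, because `?` is not allowed in file names there. The Python sketch below is a simplified model of that naming, not wget's actual code; `local_name` is a hypothetical helper.

```python
import posixpath
from urllib.parse import urlsplit


def local_name(url, windows=True):
    """Simplified model of the file name wget derives from a URL."""
    parts = urlsplit(url)
    # Last path component; wget falls back to index.html for a bare directory.
    name = posixpath.basename(parts.path) or "index.html"
    if parts.query:
        # On Windows the query separator '?' is written as '@'.
        name += ("@" if windows else "?") + parts.query
    return name


FINAL_URL = (
    "http://download.cnet.com/3001-8022_4-10804572.html"
    "?spi=077d9109e846975d0db9532bd610588f"
)
print(local_name(FINAL_URL))
```

The trailing `.1` in the log is there because a file with that name already existed in the directory from an earlier run.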

(05/26/2011 12:19 PM), Jeff Givens wrote:
 Hi, I know this is an older topic but thanks for replying.  I forgot to
 mention I had already tried what you listed below and this is the output I get:
 
 C:\DOWNLOAD>wget "http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html%3Fspi%3D077d9109e846975d0db9532bd610588f"
 --2011-05-02 12:34:20--  http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html%3Fspi%3D077d9109e846975d0db9532bd610588f
 Resolving dw.com.com... 216.239.113.95
 Connecting to dw.com.com|216.239.113.95|:80... connected.
 HTTP request sent, awaiting response... 302 Found
 Location: http://download.cnet.com/3001-8022_4-10804572.html?spi=077d9109e846975d0db9532bd610588f [following]
 --2011-05-02 12:34:21--  http://download.cnet.com/3001-8022_4-10804572.html?spi=077d9109e846975d0db9532bd610588f
 Resolving download.cnet.com... 64.30.224.58
 Connecting to download.cnet.com|64.30.224.58|:80... connected.
 HTTP request sent, awaiting response... 200 OK
 Length: unspecified [text/html]
 Saving to: `3001-8022_4-10804572.html@spi=077d9109e846975d0db9532bd610588f.1'
 
 [ <=> ] 69,240  77.3K/s   in 0.9s
 
 2011-05-02 12:34:22 (77.3 KB/s) - `3001-8022_4-10804572.html@spi=077d9109e846975d0db9532bd610588f.1' saved [69240]
 
 
 C:\DOWNLOAD>
 
 Thanks for your help.
 
 -Jeff
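
The redirect in this output is not a truncation. The long URL carries its target in the `destUrl` parameter, percent-encoded, and the `Location:` header in the log matches that value once decoded. A small Python check, assuming the URL from the log with its `&` separators intact:

```python
from urllib.parse import parse_qs, urlsplit

# URL as given on the command line in the log above.
URL = (
    "http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572"
    "&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link"
    "&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020"
    "&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm"
    "&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html"
    "%3Fspi%3D077d9109e846975d0db9532bd610588f"
)

# parse_qs splits the query on '&' and percent-decodes each value.
params = parse_qs(urlsplit(URL).query)
dest_url = params["destUrl"][0]

print(len(params), "parameters")
print(dest_url)
```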
 
 
 

Re: [Bug-wget] CNET download links not working with WGET

2011-04-23 Thread Giuseppe Scrivano
hello,

the & character in the URL is interpreted by your shell.

Try using something like:

wget "URL"

Cheers,
Giuseppe
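
A rough way to picture what happened: cmd.exe treats an unquoted `&` as a command separator, so only the text before the first `&` reached wget and each remaining piece was run as a command of its own. The Python sketch below only models that splitting (real cmd.exe parsing has more rules, such as quotes and carets); the URL is the one from the report with its `&` separators written out.

```python
# Model of how an unquoted "&" splits a cmd.exe command line.
URL = (
    "http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572"
    "&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link"
    "&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020"
    "&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm"
    "&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html"
    "%3Fspi%3D077d9109e846975d0db9532bd610588f"
)

pieces = ("wget " + URL).split("&")

# The first piece is the only command wget ever sees.
wget_command = pieces[0]

# Every later piece is run separately; the text before "=" is what
# cmd.exe reports as "not recognized as an internal or external command".
stray_commands = [piece.split("=", 1)[0] for piece in pieces[1:]]

print(wget_command)
print(stray_commands)
```

The names in `stray_commands` line up one for one with the error messages in the report below, and the truncated command matches the `redir?edId=3` request that returned the 404.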



Jeff Givens <j...@sds.net> writes:

 Hello, I am having an issue downloading files via download links from
 CNET.  It appears to locate some of the URL but stops at the first
 siteId part.  I have included the debug information as well.  Thanks
 in advance for your help.

 C:\DOWNLOAD\wget http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html%3Fspi%3D077d9109e846975d0db9532bd610588f
 --2011-04-19 11:30:35-- http://dw.com.com/redir?edId=3
 Resolving dw.com.com... 216.239.113.95
 Connecting to dw.com.com|216.239.113.95|:80... connected.
 HTTP request sent, awaiting response... 302 Found
 Location: http://dw.com.com/redir/redx/?edId=3 [following]
 --2011-04-19 11:30:36-- http://dw.com.com/redir/redx/?edId=3
 Reusing existing connection to dw.com.com:80.
 HTTP request sent, awaiting response... 404 Not Found
 2011-04-19 11:30:36 ERROR 404: Not Found.

 'siteId' is not recognized as an internal or external command,
 operable program or batch file.
 'oId' is not recognized as an internal or external command,
 operable program or batch file.
 'ontId' is not recognized as an internal or external command,
 operable program or batch file.
 'spi' is not recognized as an internal or external command,
 operable program or batch file.
 'lop' is not recognized as an internal or external command,
 operable program or batch file.
 'tag' is not recognized as an internal or external command,
 operable program or batch file.
 'ltype' is not recognized as an internal or external command,
 operable program or batch file.
 'pid' is not recognized as an internal or external command,
 operable program or batch file.
 'mfgId' is not recognized as an internal or external command,
 operable program or batch file.
 'merId' is not recognized as an internal or external command,
 operable program or batch file.
 'pguid' is not recognized as an internal or external command,
 operable program or batch file.
 'destUrl' is not recognized as an internal or external command,
 operable program or batch file.

 DEBUG output created by Wget 1.11.4 on Windows-MSVC.

 --2011-04-19 11:27:09-- http://dw.com.com/redir?edId=3
 Resolving dw.com.com... seconds 0.00, 64.30.224.42
 Caching dw.com.com => 64.30.224.42
 Connecting to dw.com.com|64.30.224.42|:80... seconds 0.00, connected.
 Created socket 340.
 Releasing 0x01411158 (new refcount 1).

 ---request begin---
 GET /redir?edId=3 HTTP/1.0

 User-Agent: Wget/1.11.4

 Accept: */*

 Host: dw.com.com

 Connection: Keep-Alive



 ---request end---
 HTTP request sent, awaiting response...
 ---response begin---
 HTTP/1.1 302 Found

 Date: Tue, 19 Apr 2011 15:27:26 GMT

 Server: Apache/2.0

 Pragma: no-cache

 Cache-control: no-cache, must-revalidate, no-transform

 Vary: *

 Expires: Fri, 23 Jan 1970 12:12:12 GMT

 Set-Cookie: XCLGFbrowser=Cg5iVk2tqd6J8Sg; expires=Sun, 18-Apr-2021
 15:27:26 GMT; domain=.com.com; path=/

 Location: http://dw.com.com/redir/redx/?edId=3

 Content-Length: 0

 P3P: CP="CAO DSP COR CURa ADMa DEVa PSAa PSDa IVAi IVDi CONi OUR OTRi
 IND PHY ONL UNI FIN COM NAV INT DEM STA"

 Keep-Alive: timeout=363, max=760

 Connection: Keep-Alive

 Content-Type: text/plain



 ---response end---
 302 Found
 Registered socket 340 for persistent reuse.
 cdm: 1 2 3 4 5 6 7 8
 Stored cookie com.com -1 (ANY) / permanent insecure [expiry
 2021-04-18 11:27:26] XCLGFbrowser Cg5iVk2tqd6J8Sg
 Location: http://dw.com.com/redir/redx/?edId=3 [following]
 Skipping 0 bytes of body: [] done.
 --2011-04-19 11:27:09-- http://dw.com.com/redir/redx/?edId=3
 Reusing existing connection to dw.com.com:80.
 Reusing fd 340.

 ---request begin---
 GET /redir/redx/?edId=3 HTTP/1.0

 User-Agent: Wget/1.11.4

 Accept: */*

 Host: dw.com.com

 Connection: Keep-Alive

 Cookie: XCLGFbrowser=Cg5iVk2tqd6J8Sg



 ---request end---
 HTTP request sent, awaiting response...
 ---response begin---
 HTTP/1.1 404 Not Found

 Date: Tue, 19 Apr 2011 15:27:26 GMT

 Server: Apache/2.0

 Content-Length: 209

 Keep-Alive: timeout=363, max=779

 Connection: Keep-Alive

 Content-Type: text/html; charset=iso-8859-1



 ---response end---
 404 Not Found
 Skipping 209 bytes of body: [<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
 <html><head>
 <title>404 Not Found</title>
 </head><body>
 <h1>Not Found</h1>
 <p>The requested URL /redir/redx/ was not found on this server.</p>
 </body></html>
 ] done.
 2011-04-19 11:27:09 ERROR 404: Not Found.