Re: newbie question
Hi Alan!

As the URL starts with https, it is a secure server. You will need to log in to this server in order to download anything. See the manual for information on how to do that (I have no experience with it myself).

Good luck,
Jens (just another user)

Alan Thomas wrote:

> I am having trouble getting the files I want using a wildcard specifier
> (-A option = accept list). The following command works fine to get an
> individual file:
>
>     wget https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/160RDTEN_FY06PB.pdf
>
> However, I cannot get all PDF files with this command:
>
>     wget -A *.pdf https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/
>
> Instead, I get:
>
>     Connecting to 164.224.25.30:443 . . . connected.
>     HTTP request sent, awaiting response . . . 400 Bad Request
>     15:57:52 ERROR 400: Bad Request.
>
> I also tried this command without success:
>
>     wget https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/*.pdf
>
> Instead, I get:
>
>     HTTP request sent, awaiting response . . . 404 Bad Request
>     15:57:52 ERROR 404: Bad Request.
>
> I read through the manual but am still having trouble. What am I doing
> wrong?
>
> Thanks, Alan
Re: wget 1.10 alpha 2
Hrvoje Niksic [EMAIL PROTECTED] writes:

> [EMAIL PROTECTED] writes:
>
>> If possible, it seems preferable to me to use the platform's C library
>> regex support rather than make wget dependent on another library...
>
> Note that some platforms don't have library support for regexps, so
> we'd have to bundle anyway.

Oh, and POSIX regexps don't support -- and never will -- non-greedy quantifiers, which are perhaps the single most useful addition in Perl 5 regexps. Incidentally, the regex.c bundled with GNU Emacs supports them, along with non-capturing (shy) groups, another very useful feature.
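To make the greedy/non-greedy distinction concrete, here is a small shell illustration. It assumes GNU grep, whose -P option enables Perl-compatible regexps as a stand-in for Perl 5 proper; plain grep -o uses POSIX (greedy) matching.

```shell
# POSIX matching is greedy: '.*' consumes as much as possible, so the
# match runs from the first '<' to the last '>'.
echo '<a><b>' | grep -o '<.*>'

# Perl 5 / PCRE non-greedy matching: '.*?' stops at the earliest
# possible '>', giving one match per tag.  POSIX regexps have no
# equivalent quantifier.
echo '<a><b>' | grep -oP '<.*?>'
```

The first command prints `<a><b>` as a single match; the second prints `<a>` and `<b>` on separate lines.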
RE: newbie question
Alan Thomas wrote:

> I am having trouble getting the files I want using a wildcard
> specifier...

There are no command-line options for what you're attempting to do. Neither wget nor the server you're contacting understands *.pdf in a URI.

In the case of wget, it is designed to read web pages (HTML files), collect a list of the resources referenced in those pages, and then retrieve them. In the case of the web server, it is designed to return individual objects on request (X.pdf or Y.pdf, but not *.pdf). Some web servers will return a list of files if you specify a directory, but you already tried that in your first use case.

Try coming at this from a different direction: if you were going to manually download every PDF from that directory, how would YOU figure out the names of each one? Is there a web page that contains a list somewhere? If so, point wget there.

Hope that helps.

Tony

PS) Jens was mistaken when he said that https requires you to log into the server. Some servers may require authentication before returning information over a secure (https) channel, but that is not a given.
Re: newbie question
Alan Thomas [EMAIL PROTECTED] writes:

> I am having trouble getting the files I want using a wildcard specifier
> (-A option = accept list). The following command works fine to get an
> individual file:
>
>     wget https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/160RDTEN_FY06PB.pdf
>
> However, I cannot get all PDF files with this command:
>
>     wget -A *.pdf https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/
>
> Instead, I get:
>
>     Connecting to 164.224.25.30:443 . . . connected.
>     HTTP request sent, awaiting response . . . 400 Bad Request
>     15:57:52 ERROR 400: Bad Request.

Does that URL work with a browser? What version of Wget are you using?

Using -d will provide a full log of what Wget is doing, as well as the responses it is getting. You can mail the log here, but please make sure it doesn't contain sensitive information (if applicable); this list is public and has public archives.

Please note that you also need -r (or, even better, -r -l1) for -A to work the way you want.
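One more thing worth checking: the URL itself contains characters the shell treats specially, which can corrupt the request before wget ever sees it. A sketch of a safely quoted invocation (shown with echo so it performs no network access; drop the echo to actually run it):

```shell
# '(' ')' and '$' are special to the shell: left unquoted, $reload and
# $FILE get expanded as (likely empty) shell variables, and unquoted
# *.pdf may be globbed against local files.  Single quotes make wget
# receive the URL and the accept pattern verbatim.
url='https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/'

# -r recurses, -l1 limits the depth to one level, -A keeps only PDFs.
echo wget -r -l1 -A '*.pdf' "$url"
```

The echo prints the command with `$reload` and `$FILE` intact, confirming nothing was expanded.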
Re: newbie question
Tony Lewis [EMAIL PROTECTED] writes:

> PS) Jens was mistaken when he said that https requires you to log into
> the server. Some servers may require authentication before returning
> information over a secure (https) channel, but that is not a given.

That is true. HTTPS provides encrypted communication between the client and the server, but it doesn't always imply authentication.
IPv6 on Windows
It occurred to me that the 1.10 NEWS file declares IPv6 to be supported. However, as far as I know, IPv6 doesn't work under Windows. Though it seems that Winsock 2 (which mswindows.h is apparently trying to support) implements IPv6, I have a nagging suspicion that just including winsock2.h and defining HAVE_GETADDRINFO and ENABLE_IPV6 won't be enough[1].

So what does it take to support IPv6 under Windows? If this is not possible in the 1.10 time frame, we should probably note in NEWS that IPv6 is still not supported under Windows.

[1] On the other hand, the 1.7 patch available at http://win6.jp/Wget/wget-1.7-win32-v6-20010716a-2.zip suggests exactly that.
Re: Troubles with mirroring.
I only want to mirror a web site whose content is spread over two different servers (different domains), without mirroring parent directories. It seems it would be good if the -np option worked together with the settings given in -I, and/or if the -D option accepted not just bare domains but domains with paths to directories.

Because there have been no replies, I assume that Wget cannot do this. So please answer at least this question now: will you enhance/modify Wget somehow, so that the next release can do it?

Cheers!
ak
Re: Troubles with mirroring.
Andrzej [EMAIL PROTECTED] writes:

> So, please, answer at least this question now: will you enhance/modify
> Wget somehow, so that the next release can do it?

The next release is in feature freeze, so it will almost certainly not support this feature. However, IMHO it makes a lot of sense to augment -I/-D with paths. I've never been really satisfied with the interaction of -D and -np anyway.
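For reference, this is roughly what the current options can already express (the hostnames are placeholders, and the echo keeps the sketch inert): -D restricts host-spanning recursion to the listed domains, and -I restricts retrieval to the listed directory prefixes, but -I applies to every accepted domain at once, which is exactly the limitation under discussion.

```shell
# Real wget flags: -m mirrors, -H spans hosts, -D accepts only these
# domains, -I accepts only these directory prefixes.  There is no way
# to say "directory /files only on mirror.example.net".
echo wget -m -H -D example.com,mirror.example.net -I /site,/files http://example.com/
```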
Re: Troubles with mirroring.
Andrzej [EMAIL PROTECTED] writes:

>> The next release is in feature freeze, so it will almost certainly not
>> support this feature. However, IMHO it makes a lot of sense to augment
>> -I/-D with paths. I've never been really satisfied with the
>> interaction of -D and -np anyway.
>
> Can you estimate when and in what version it will be available?

Unfortunately not. Wget is run by volunteers, and if this extension is not interesting enough for a programmer to pick it up, it won't get done. The best way to make it happen is to write the code yourself, or convince/hire a programmer to do it for you.
Re: Troubles with mirroring.
> Unfortunately not. Wget is run by volunteers, and if this extension is
> not interesting enough for a programmer to pick it up, it won't get
> done.

I think it is very interesting and, first of all, useful. Without it, real, smooth, fully automatic mirroring of web sites that keep their content on several servers/domains is simply not possible. That's a huge disadvantage IMHO.

> The best way to make it happen is to write the code yourself, or

I would if I could. :(

> convince/hire a programmer to do it for you.

And that's exactly what I'm trying to do here on the list. :) Ideas are even more important than just coding, because without ideas there would be no coding. :P So I'm giving you some ideas, and someone skilled in coding (maybe you, as the inventor of Wget? :)) could write the code. Well, I do hope so.

Cheers!
ak
Re: wget 1.10 alpha 2
On Wednesday 13 April 2005 07:39 am, Herold Heiko wrote:

> With MS Visual Studio 6, wget still needs the attached patch in order
> to compile (disable optimization for part of http.c and retr.c if the
> cl.exe version = 12). Windows MSVC test binary at
> http://xoomer.virgilio.it/hherold/

hi herold, the patch you've posted is really such an ugly workaround (shame on microsoft and their freaking compilers) that i am not very willing to merge it into our cvs. have you tried the microsoft visual c++ toolkit 2003? maybe it works. you can download it for free at the following URL:

http://msdn.microsoft.com/visualc/vctoolkit2003/

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi
University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute of Human Machine Cognition     http://www.ihmc.us
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it
Re: newbie question
Hi!

Yes, I see now that I misread Alan's original post. I thought he was not even able to download the single .pdf. I don't know why, as he clearly said that getting a single PDF works. Sorry for the confusion!

Jens

> Tony Lewis [EMAIL PROTECTED] writes:
>
>> PS) Jens was mistaken when he said that https requires you to log into
>> the server. Some servers may require authentication before returning
>> information over a secure (https) channel, but that is not a given.
>
> That is true. HTTPS provides encrypted communication between the client
> and the server, but it doesn't always imply authentication.
Re: IPv6 on Windows
On Thu, 14 Apr 2005, Hrvoje Niksic wrote:

> It occurred to me that the 1.10 NEWS file declares IPv6 to be
> supported. However, as far as I know, IPv6 doesn't work under Windows.
> Though it seems that Winsock 2 (which mswindows.h is apparently trying
> to support) implements IPv6, I have a nagging suspicion that just
> including winsock2.h and defining HAVE_GETADDRINFO and ENABLE_IPV6
> won't be enough[1]. So what does it take to support IPv6 under Windows?

I've been working on the mingw port, using configure and the -mno-cygwin switch under cygwin (rather than the batch file that comes with the wget distribution). At least with the patches I made to allow this to configure and compile, IPv6 is not supported. I may have some time this weekend to see if I can get IPv6 working, at least for the mingw Windows port.

At the moment, the patch for mingw is combined with my DOS patch. I'll try to separate the mingw parts out, so that mingw changes can be considered separately from the DOS changes.

Doug

-- 
Doug Kaufman
Internet: [EMAIL PROTECTED]
Re: [unclassified] Re: newbie question
I got the wgetgui program and used it successfully. The commands were very much like this one.

Thanks, Alan

----- Original Message -----
From: Technology Freak [EMAIL PROTECTED]
To: Alan Thomas [EMAIL PROTECTED]
Sent: Thursday, April 14, 2005 10:12 AM
Subject: [unclassified] Re: newbie question

> Alan,
>
> You could try something like this:
>
>     wget -r -d -l1 -H -t1 -nd -N -np -A pdf URL
>
> On Wed, 13 Apr 2005, Alan Thomas wrote:
>
>> Date: Wed, 13 Apr 2005 16:02:40 -0400
>> From: Alan Thomas [EMAIL PROTECTED]
>> To: wget@sunsite.dk
>> Subject: newbie question
>>
>> I am having trouble getting the files I want using a wildcard
>> specifier (-A option = accept list). The following command works fine
>> to get an individual file:
>>
>>     wget https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/160RDTEN_FY06PB.pdf
>>
>> However, I cannot get all PDF files with this command:
>>
>>     wget -A *.pdf https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/
>
> ---
> TekPhreak
> [EMAIL PROTECTED]
> http://www.tekphreak.com
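For anyone reading along, a breakdown of what each flag in that suggested command does (all are real wget options; URL is a placeholder, and the echo keeps this sketch from touching the network):

```shell
# -r    recursive retrieval (required for -A to act on linked files)
# -d    debug output
# -l1   limit recursion depth to one level
# -H    span hosts, i.e. follow links that lead to other servers
# -t1   try each download only once (no retries)
# -nd   no directories: save every file into the current directory
# -N    timestamping: skip files not newer than an existing local copy
# -np   no parent: never ascend above the starting directory
# -A pdf  accept list: keep only files whose names end in .pdf
echo wget -r -d -l1 -H -t1 -nd -N -np -A pdf URL
```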
Build problem: ptimer.c (CVS 1.7), gcc 3.4.3, Tru64 UNIX V5.1B
    urt# gcc -v
    Reading specs from /usr/local/lib/gcc/alpha-dec-osf5.1/3.4.3/specs
    Configured with: /usr1/local/gnu/gcc-3.4.3/configure
    Thread model: posix
    gcc version 3.4.3

    urt# sizer -v
    Compaq Tru64 UNIX V5.1B (Rev. 2650); Thu Mar 6 19:03:28 CST 2003

[...]

    gcc -I. -I. -I/opt/include -DHAVE_CONFIG_H
      -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"
      -DLOCALEDIR=\"/usr/local/share/locale\"
      -O2 -Wall -Wno-implicit -c ptimer.c
    ptimer.c:95:20: operator '>' has no left operand

[...]

The offending code (line 95) is:

    # if _POSIX_TIMERS > 0

There's no left operand because:

    urt# grep POSIX_TIMERS /usr/include/*.h
    /usr/include/unistd.h:#define _POSIX_TIMERS

Is there any reason that

    # ifdef _POSIX_TIMERS

would be worse?

Steven M. Schweda               (+1) 651-699-9818
382 South Warwick Street        [EMAIL PROTECTED]
Saint Paul  MN  55105-2547
RFC conflict for RANGE
For wget -c, a range is specified, i.e.:

    GET /dubai.jpg HTTP/1.0
    User-Agent: Wget/1.9.1
    Host: localhost:4400
    Accept: */*
    Connection: Keep-Alive
    Range: bytes=40-

However, HTTP/1.0 (RFC 1945) does not specify a Range header, although it appears in section 14.35.1 of HTTP/1.1 (RFC 2616). I didn't know whether you cared about it, but yes, I consider it a bug.

Sincerely,
Christopher J. McKenzie
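To make the mismatch concrete, here is the quoted request rebuilt with printf (header block only; no connection is made). Note that RFC 1945 tells recipients to ignore unrecognized header fields, so a strict HTTP/1.0 server would simply return the whole entity with a 200, a case wget has to handle anyway.

```shell
# The request quoted above, one printf per header line.  Range is
# defined only in HTTP/1.1 (RFC 2616, section 14.35.1), so pairing it
# with an HTTP/1.0 request line is the inconsistency being reported.
printf 'GET /dubai.jpg HTTP/1.0\r\n'
printf 'User-Agent: Wget/1.9.1\r\n'
printf 'Host: localhost:4400\r\n'
printf 'Accept: */*\r\n'
printf 'Connection: Keep-Alive\r\n'
printf 'Range: bytes=40-\r\n'
printf '\r\n'
```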