Re: Back after a while
From: Hrvoje Niksic [EMAIL PROTECTED] The other function arguments control various formatting options. (Where can't GCC printf() using %ll?) For the record, GCC doesn't printf() anything, printf is defined in the standard library. If the operating system's printf() doesn't support %ll, it will not work in GCC either. I thought there was a GCC run-time library for this stuff, but perhaps a better question would have been, 'Where can't a GCC user do a printf() using %ll?'. Or %something. The variability of the something was what drove the Info-ZIP code to use the annoying (but portable, given enough #ifdef's) fzofft() function. It certainly seems the InfoZip developers have paid a great deal of attention to LFS and portability. Thanks for the tip. They (we?) are more VMS-friendly, too. (But you're welcome to visit http://antinode.org/docs/dec/sw/wget.html and glean what you can.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Back after a while
From: Hrvoje Niksic [EMAIL PROTECTED] but perhaps a better question would have been, 'Where can't a GCC user do a printf() using %ll?'. On any system that predates `long long'. For example, SunOS 4.1.x, Ultrix, etc. I thought we were discussing changes for large-file support. Perhaps I'm seeing things through my Info-ZIP filters, where the programs work as well as they can with the OS features which are available, and large files are the only thing demanding 64-bit integers. While I'm aware of some 64-bit systems which lacked large-file support, I'm not aware of any environment which offers large-file support and lacks adequate support for 64-bit integers. (Nor would I care much about it if there were one.) SMS.
Re: Large file support (was Re: Back after a while)
Is the source for Zip 3.0/UnZip 6.0 publicly available? I believe that some relatively recent beta code (Zip 3.0d, UnZip 6.0b) is available under: ftp://ftp.info-zip.org/pub/infozip/OLD/beta/ and probably various mirrors around the world. You might wish to start at http://www.info-zip.org/Zip.html#Sources, choose a host, then look for the new stuff there. Zip 3.0e and UnZip 6.0c (still beta) are in the works (getting closer, but with no firm date), but I believe that the large-file code had largely settled down in what you can find now. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Large file support (was: Back after a while)
It's not my program (obviously), but: 1. I'd say that code like if ( sizeof(number) == 8 ) should have been a compile-time #ifdef rather than a run-time decision. 2. Multiple functions like print_number_as_string() and print_second_number_as_string() (and so on?) look like a real pain to use. The Info-ZIP code uses one function with a ring of string buffers to ease the load on the programmer. So long as you don't put too many calls into the same printf(), it's pretty painless. 3. print_second_number_as_string()? Are you sure the names are long enough? (VMS C (by default) truncates externals longer than 31 characters, so I worry about these things even if no one else objects to typing for days on one statement.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Large file support
From: Hrvoje Niksic [...] It is no small task to study Info-ZIP's source code. I did plan to look at it later, but at the time it was quicker to just ask. It's fairly easy to SEARCH [...]*.c, *.h LARGE_FILE_SUPPORT (or your local find/grep equivalent). Locating zip_fzofft would also be pretty easy. Anyway, why the unfriendliness? [...] Not unfriendly, just a bit frustrated. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: O_EXCL and large files
From: Hrvoje Niksic: [...] Solaris, on the other hand, seems to use the open64 function. (But will open be automatically mapped to open64 when _FILE_OFFSET_BITS is 64?) For all I know, other systems may require something different. SunOS 5.9 /usr/include/fcntl.h: [...] /* large file compilation environment setup */ #if !defined(_LP64) && _FILE_OFFSET_BITS == 64 #ifdef __PRAGMA_REDEFINE_EXTNAME #pragma redefine_extname open open64 #pragma redefine_extname creat creat64 [...] The idea is to continue to use open(), but to define the right macros to get the right open(). On VMS (with its RMS I/O layer), open() is not so fundamental, and only functions which explicitly use off_t seem to be affected. And yes, VMS does require something different. The macro there is _LARGEFILE. (But that's a problem for config.h and/or the builders.) A quick look at Tru64 UNIX suggests that large-file is all there is. I'd say that if it fails on Linux, then Linux has a problem. (But I'd expect it to be fine if you do it correctly.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
wget 1.10 alpha 1
What would it take to get my VMS changes into the main code stream? http://antinode.org/dec/sw/wget.html Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget 1.10 alpha 1
From: Mauro Tortonesi [EMAIL PROTECTED] [...] i think that if you want your patches to be merged in our CVS, you should follow the official patch submission procedure (that is, posting your patches to the wget-patches AT sunsite DOT dk mailing list. each post should include a brief comment about what the patch does, and especially why it does so). this would save a lot of time to me and hrvoje and would definitely speed up the merging process. [...] Perhaps. I'll give it a try. Also, am I missing something obvious, or should the configure script (as in, To configure Wget, run the configure script provided with the distribution.) be somewhere in the CVS source? I see many of its relatives, but not the script itself. And I'm just getting started, but is there any good reason for the extern variables output_stream and output_stream_regular not to be declared in some header file? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget 1.10 alpha 1
From: Hrvoje Niksic [EMAIL PROTECTED] Also, am I missing something obvious, or should the configure script (as in, To configure Wget, run the configure script provided with the distribution.) be somewhere in the CVS source? The configure script is auto-generated and is therefore not in CVS. To get it, run autoconf. See the file README.cvs. Sorry for the stupid question. I was reading the right document but then I got distracted and failed to get back to it. Thanks for the quick, helpful responses. And I'm just getting started, but is there any good reason for the extern variables output_stream and output_stream_regular not to be declared in some header file? No good reason that I can think of. I'm busy segregating all/most of the VMS-specific stuff into a vms directory, to annoy the normal folks less. Currently, I have output_stream, output_stream_regular, and total_downloaded_bytes in (a new) main.h, but I could do something else if there's a better plan. Rather than do something similar for version_string, I just transformed version.c into version.h, which (for the moment) contains little other than: #define VERSION_STRING "1.10-alpha1_sms1" Was there any reason to do this with a source module instead of a simple macro in a simple header file? Was there any reason to use '#include <config.h>' instead of '#include "config.h"'? This hosed my original automatic dependency generation, but a work-around was easy enough. It just seemed like a difference from all the other non-system inclusions with no obvious (to me) reason. Currently, I'm working from a CVS collection taken on 11 April. Assuming I can get this stuff organized in the next few days or so, what would be the most convenient code base to use? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Build problem: ptimer.c (CVS 1.7), gcc 3.4.3, Tru64 UNIX V5.1B
urt# gcc -v Reading specs from /usr/local/lib/gcc/alpha-dec-osf5.1/3.4.3/specs Configured with: /usr1/local/gnu/gcc-3.4.3/configure Thread model: posix gcc version 3.4.3 urt# sizer -v Compaq Tru64 UNIX V5.1B (Rev. 2650); Thu Mar 6 19:03:28 CST 2003 [...] gcc -I. -I. -I/opt/include -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c ptimer.c ptimer.c:95:20: operator '>' has no left operand [...] The offending code (line 95) is: # if _POSIX_TIMERS > 0 There's no left operand because: urt# grep POSIX_TIMERS /usr/include/*.h /usr/include/unistd.h:#define _POSIX_TIMERS Is there any reason that # ifdef _POSIX_TIMERS would be worse? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Build problem: ptimer.c (CVS 1.7), gcc 3.4.3, Tru64 UNIX V5.1B
# if defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0 That's fine, if you prefer: ptimer.c:95:46: operator '&&' has no right operand This doc makes it appear that the unistd.h here does not conform: http://www.opengroup.org/onlinepubs/009695399/basedefs/unistd.h.html I fear that the test must be beaten into something much uglier, and I have not yet thought of anything good to do the job. SMS.
Re: Build problem: ptimer.c (CVS 1.7), gcc 3.4.3, Tru64 UNIX V5.1B
I suppose we should then use: #ifdef _POSIX_TIMERS # if _POSIX_TIMERS > 0 Doesn't help. It's defined, but null. Mr. Jones is probably close to the right answer with: #if _POSIX_TIMERS - 0 > 0 I was looking for a way to make null look like positive, but a little more reading (http://www.opengroup.org/onlinepubs/009695399/basedefs/unistd.h.html) suggests that zero is about as reasonable as anything: If a symbolic constant is defined with the value -1, the option is not supported. Headers, data types, and function interfaces required only for the option need not be supplied. An application that attempts to use anything associated only with the option is considered to be requiring an extension. If a symbolic constant is defined with a value greater than zero, the option shall always be supported when the application is executed. All headers, data types, and functions shall be present and shall operate as specified. If a symbolic constant is defined with the value zero, all headers, data types, and functions shall be present. The application can check at runtime to see whether the option is supported by calling fpathconf(), pathconf(), or sysconf() with the indicated name parameter. Pending a good counter argument, the best way out may be: # if defined(_POSIX_TIMERS) && (_POSIX_TIMERS - 0 >= 0) Perhaps with a comment describing the (unknown) danger. (Then wait for the next complaint.) Everything's complicated. SMS.
Re: wget 1.10 release candidate 1
i have just released the first release candidate of wget 1.10: ftp://ftp.deepspace6.net/pub/ds6/sources/wget/wget-1.10-rc1.tar.gz ftp://ftp.deepspace6.net/pub/ds6/sources/wget/wget-1.10-rc1.tar.bz2 you are encouraged to download the tarballs, test if the code works properly and report any bug you find. The VMS changes seem to be missing. But you probably knew that. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget and ASCII mode
From: Kiran Atlluri [...] I am trying to retrieve a ".csv" file on a unix system using wget (ftp mode). When I retrieve a file using normal FTP and specify ASCII mode, I successfully get the file and there are no "^M" at the end of line in this file. But when I use wget all the lines in the file have this "^M" at the end. [...] This happens because write_data() (in src/retr.c) does nothing to adjust the FTP-standard CR-LF line endings according to the local standard (in this case, LF-only), which a proper FTP client should do. A fix for this was included among my recent (well, not _very_ recent now) VMS-related patch submissions, but it would probably be a mistake to hold your breath waiting for those changes to be incorporated into the main code stream. If you're desperate to see what I did to fix this, you could visit: http://antinode.org/ftp/wget/patch1/ ftp://antinode.org/wget/patch1/ A quick search for the (new) enum value rb_ftp_ascii suggests that the relevant changes are in ftp.c, retr.c, and retr.h. Feel free to get in touch if you have any questions about what you find there. (The new code does make one potentially risky assumption, but it's explained in the comments.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget and ASCII mode
[...] (The new code does make one potentially risky assumption, but it's explained in the comments.) The latest code in my patches and in my new 1.9.1d kit (for VMS, primarily, but not exclusively) removes the potentially risky assumption (CR and LF in the same buffer), so it should be swell. I've left it for someone else to activate the conditional code which would restore CR-LF line endings on systems where that's preferred. It does seem a bit odd that no one has noticed this fundamental problem until now, but then I missed it, too. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
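One way to do the CR-LF conversion without assuming that the CR and LF arrive in the same buffer is to carry a one-byte state flag between calls. The following is a minimal sketch of that idea, not the actual patch code; a caller would still need to flush a pending CR at end of stream:

```c
#include <stddef.h>

/* Conversion state that persists across buffers: set when the last
 * byte of the previous buffer was a CR whose fate is undecided. */
struct eol_state { int pending_cr; };

/* Convert CR-LF to LF in buf, in place; return the new length.
 * A lone CR (one not followed by LF) is preserved. */
static size_t crlf_to_lf(char *buf, size_t len, struct eol_state *st)
{
    size_t in = 0, out = 0;

    while (in < len) {
        char c = buf[in++];
        if (st->pending_cr) {
            st->pending_cr = 0;
            if (c != '\n')
                buf[out++] = '\r';  /* Lone CR: keep it after all. */
        }
        if (c == '\r')
            st->pending_cr = 1;     /* Decide when the next byte arrives. */
        else
            buf[out++] = c;
    }
    return out;    /* Caller must flush a trailing pending CR at EOF. */
}
```

A CR at the very end of one buffer and an LF at the start of the next are still recognized as a single line ending, which is the cross-buffer case mentioned above.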
Re: wget and ASCII mode
from Hrvoje Niksic: [...] Unfortunately EOL conversions break automatic download resumption (REST in FTP), Could be true. manual resumption (wget -c), Could be true. (I never use wget -c.) break timestamping, How so? and probably would break checksums if we added them. You don't have them, and anyone who would be surprised by this should be directed to the note in the documentation which would explain why. Most of Wget's users seem to want byte-by-byte copies, because I don't remember a single bug report about the lack of ASCII conversions. You mean other than the one from the fellow who started this thread? The one thing that is surely wrong about my approach is the ';type=a' option, which should either be removed or come with a big fat warning that it *doesn't* implement the required conversion to native EOL convention and that it's provided for the sake of people who need text transfers and are willing to invoke dos2unix/unix2dos (or their OS equivalent) themselves. Interesting. I'd have made ;type=a work right (which I claim to have done), and then perhaps included a run-time error or documentation warning if it were mixed with incompatible options (which I haven't done). Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: strtoll() not available on HP-UX
From: Hrvoje Niksic Fri, 12 Aug 2005 09:00:34 -0700 [...] -- after all, Wget has long supported platforms with much worse standard-conformance track records. And it has long not supported others, like VMS, with better ones, although I've tried to do what I could. (At least VMS has strtoll().) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: with recursive wget status code does not reflect success/failure of operation
From: Mauro Tortonesi [EMAIL PROTECTED] Ideally, the values used could be defined in some central location, allowing convenient replacement with suitable VMS-specific values when the time comes. (Naturally, _all_ exit() calls and/or return statements should use one of the pre-defined values.) mmh, i don't understand why we should use VMS-specific values in wget. On VMS (not elsewhere), Wget should use VMS-specific values. The VMS C RTL is willing to convert 0 into a generic success code, but 1 (EPERM, Not owner) and 2 (ENOENT, No such file or directory) would tend to confuse the users (and the rest of the OS). Having the exit codes defined in a central location would make it easy to adapt them as needed. Having to search the code for every instance of return 1 or exit(2) would make it too complicated. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
wget 1.10.2 released
A kit for Wget 1.10.2a for VMS is available in the usual places: http://antinode.org/dec/sw/wget.html http://antinode.org/ftp/wget/wget-1_10_2a_vms/ ftp://antinode.org/wget/wget-1_10_2a_vms/ As usual, the Zip-archive kit there includes Alpha, IA64, and VAX binaries, and the source should still be good on non-VMS systems. (Better, if you're trying to access a VMS FTP server.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: RFE: gethostbyname bypass
Is this anything like the recent inquiry, "Wget Feature request IP address override"? http://www.mail-archive.com/wget@sunsite.dk/msg08340.html Maybe this should make it into the FAQ, people ask for it quite often. Perhaps so. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Error connecting to target server
It works fine from here (209.98.249.184, Wget 1.10.2a1, VMS Alpha V7.3-2). If it hangs for you, it could be that firewall. It's easy enough to block port 80 and pass ping. Does any browser work? I suspect not. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: retr.c:292: calc_rate: Assertion `bytes = 0' failed.
I realise that 1.10.2 is the latest version, but Debian doesn't seem to think so :-) If you expect Wget to work with files bigger than 2GB, you'll just have to use a Wget version which works with files bigger than 2GB. 1.10.2, for example, not 1.9.1. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget BUG: ftp file retrieval
From: Hrvoje Niksic Also don't [forget to] prepend the necessary [...] $CWD to those paths. Or, better yet, _DO_ forget to prepend the trouble-causing $CWD to those paths. As you might recall from my changes for VMS FTP servers (if you had ever looked at them), this scheme causes no end of trouble. A typical VMS FTP server reports the CWD in VMS form (for example, SYS$SYSDEVICE:[ANONYMOUS]). It may be willing to use a UNIX-like path in a CWD command (for example, CWD A/B), but it's _not_ willing to use a mix of them (for example, SYS$SYSDEVICE:[ANONYMOUS]/A/B). At a minimum, a separate CWD should be used to restore the initial directory. After that, you can do what you wish. On my server at least (HP TCPIP V5.4), GET A/B/F.X will work, but the mixed mess is unlikely to work on any VMS FTP server. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget BUG: ftp file retrieval
From: Hrvoje Niksic Prepending is already there, Yes, it certainly is, which is why I had to disable it in my code for VMS FTP servers. and adding it fixed many problems with FTP servers that log you in a non-/ working directory. Which of those problems would _not_ be fixed by my two-step CWD for a relative path? That is: 1. CWD to the string which the server reported in its initial PWD response. 2. CWD to the relative path in the URL (A/B in our current example). On a VMS server, the first path is probably pure VMS, so it works, and the second path is pure UNIX, so it also works (on all the servers I've tried, at least). As I remark in the (seldom-if-ever-read) comments in my src/ftp.c, I see no reason why this scheme would fail on any reasonable server. But I'm always open to a good argument, especially if it includes a demonstration of a good counter-example. This (in my opinion, stinking-bad) prepending code is the worst part of what makes the current (not-mine) VMS FTP server code so awful. (Running a close second is the part which discards the device name from the initial PWD response, which led to a user complaint in this forum a while back, involving an inability to specify a different device in a URL.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget BUG: ftp file retrieval
From: Hrvoje Niksic [...] On Unix-like FTP servers, the two methods would be equivalent. Right. So I resisted temptation, and kept the two-step CWD method in my code for only a VMS FTP server. My hope was that some one would look at the method, say That's a good idea, and change the if to let it be used everywhere. Of course, I'm well known to be delusional in these matters. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget output question
1. retrieve a single page That worked. 2. convert the links in the retrieved page to their full, absolute addresses. My wget -h output (Wget 1.10.2a1) says: -k, --convert-links make links in downloaded HTML point to local files. Wget 1.9.1e says: -k, --convert-links convert non-relative links to relative. Not anything about converting relative links to absolute. I don't see an option to do this automatically. 3. save the page with a file name that I specify That worked. That's two out of three. Why would you want this result? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget output question
I do get the full Internet address in the download if I use -k or --convert-links, but not if I use it with -O Ah. Right you are. Looks like a bug to me. Wget/1.10.2a1 (VMS Alpha V7.3-2) says this without -O: 08:53:42 (51.00 MB/s) - `index.html' saved [2674] Converting index.html... 0-14 Converted 1 files in 0.232 seconds. and this with -O: 08:54:06 (297.15 KB/s) - `test.html' saved [2674] test.html: file currently locked by another user [Sounds VMS-specific, yes?] Converting test.html... nothing to do. Converted 1 files in 0.039 seconds. The message from Wget 1.9.1a was less informative: 08:57:13 (297.11 KB/s) - `test.html' saved [2674] : no such file or directory Converting ... nothing to do. Converted 1 files in 0.00 seconds. Without looking at the code, I'd say that someone is calling the conversion code before closing the -O output file. As a user could specify multiple URLs with a single -O output file, it may be difficult to make this work in the same way it would without -O, so a normal download followed by a quick rename (mv) might be your best hope, at least in the short term. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Problems using -O under Windows
From: Andrea Controzzi If I do wget -k http://www.google.it -O test.html, I get this error: Unable to delete `test.html': Permission denied You might see something familiar under the topic wget output question, where a similar problem is discussed. For best results, you might also disclose your Wget version. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: characters downloaded via wget
From: Pasciak, Patrick A. When I download via wget to a Unix platform, I pull down the "funky" characters. Is there a wget option to handle any data conversions? Download what? How? Which Wget version? Which UNIX platform? Define "pull down". Define "funky". What kind of data conversions? What are you talking about? The information contained in this message may be privileged and confidential and protected from disclosure. [...] If I couldn't ask a question any better than that, I'd want it kept confidential, too. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Wishlist: support the file:/// protocol
I, too, see little value in using Wget to copy files which are accessible locally, but let's say that someone wished to add this feature. Given a link like file:///a/b.c, what would be the destination for the downloaded file on the local file system? How would link conversion work? Also, if the implementation involves something as clever as 'system( cp -p /a/b.c somewhere);', please bear in mind that such code is not portable (to VMS, for example). Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget-1.10.2 compile errors
I can't check this easily, but it appears that your (unidentified) C compiler is not happy with the macro PTR_FORMAT, defined in src/wget.h: #define PTR_FORMAT(p) 2 * sizeof (void *), (unsigned long) (p) No bets, but you might try something like: #define PTR_FORMAT(p) ((int)(2 * sizeof (void *))), (unsigned long) (p) The sizeof operator is likely to produce a size_t result, which may differ from int enough to provoke a fussy compiler. Note that, as portable code goes, this isn't very. Assuming that an unsigned long is the same size (or at least as large) as a pointer is an invitation to trouble. (And using format %lx as the nearby comment suggests is also likely to cause trouble when this coincidence fails.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget-1.10.2 compile errors
From: Hrvoje Niksic The code only uses that for printing the pointer's value (only used for debugging), and even so it carefully casts the pointer to unsigned long, to avoid a mismatch between pointer size and %lx. And this still assumes that unsigned long is big enough to hold a pointer, which may not be true: alp $ run SIZ_P64.EXE char = 1, int = 4, long = 4, long long = 8, void* = 8. The only trouble will be misprinted pointers. As I recall, I did not say that it was not very portable code _and_ that it was important, only that it was not very portable code. Correctly formatting a pointer (or off_t) value which may be different sizes on different systems (or with different build options) is harder than it should be. I've previously mentioned the Info-ZIP code which does a better job, but it does have more complexity and conditionality. Everything's complicated. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget from SVN: Issue with recursive downloading from http:// sites
[...] wget is the SVN version, which is located at /usr/local/bin/wget [...] [...] (/usr/bin/wget is the version of wget that ships with the distro that I run, Fedora Core 3) [...] Results from wget -V would be much more informative than knowing the path(s) to the executable(s). (Should I know what SVN is?) Adding -d to your wget commands could also be more helpful in finding a diagnosis. If one program works and one doesn't, why use the one which doesn't? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget from SVN: Issue with recursive downloading from http:// sites
Adding -d to your wget commands could also be more helpful in finding a diagnosis. Still true. GNU Wget 1.10.2b built on VMS Alpha V7.3-2 (the original wget 1.10.2 with my VMS-related and other changes) seems to work just fine on that site. You might try starting with a less up-to-the-minute source kit to see if that helps. (Although you'd like to think that such a gross problem would be detected before any such problem code had been checked in. And with that site's content, I might prefer any program which sucked down less of it, but that's neither here nor there.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget from SVN: Issue with recursive downloading from http:// sites
Your -d output suggests a defective Wget (probably because Wget/1.10+devel was still in development). A working one spews much more stuff (as it downloads much more stuff). I'd try starting with the last released source kit: http://www.gnu.org/software/wget/ http://www.gnu.org/software/wget/index.html#downloading http://ftp.gnu.org/pub/gnu/wget/ http://ftp.gnu.org/pub/gnu/wget/wget-1.10.2.tar.gz [...] What exactly does that mean? I was just complaining about the content at afolkey2.net, but, as I said, that's neither here nor there. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget -O writes empty file on failure
When Wget fetches a URL to store into a file with a URL-derived name, it can easily open the output file after it knows that the download has begun. With -O, multiple URLs are possible, and so Wget opens the file before any download is attempted. Consider: wget -O fred http://www.gnu.org/ http://www.gnu.org/nonexistent Here, one fetch works, and the other does not. Is that successful or not? Wget could probably be changed to delay opening the -O file until a download succeeds, or it could detect any output to a -O file, and do a delete-on-close if nothing is ever written to it, but it'd probably be simpler for the fellow who specifies -O to do the check himself. man wc Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: --page-requisites option
wget -V should tell us which Wget version you are using. 1.10.2 is the latest released version. http://directory.fsf.org/wget.html Adding -d to the command may generate some useful output. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: download file with latest modified date in directory
You could do something with the --no-remove-listing option, like: wget --no-remove-listing \ ftp://ftp.symantec.com/AVDEFS/norton_antivirus/xdb/fred* or: wget --no-remove-listing \ ftp://ftp.symantec.com/AVDEFS/norton_antivirus/xdb/ Then, look at the resulting .listing file (and/or index.html, depending), extract the name (and/or URL) of the file you'd like, and use it in a second Wget command. The details would depend on your OS, which I must have missed (along with the Wget version you're using). (I assume that a DCL example procedure would not be of much use to you, but that's what I'd write.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: new feature...
From: Kristian Nilssen It's not ugly to implement - I've done it. 1 line of code. [...] Not having seen that line of code, I can't say how portable it might be, but if it uses system() rather than fork(), it'd probably work on VMS. (Which should not be interpreted as a claim that the Wget developers actually care about VMS.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Configuring WGet on Solaris 2.6
I haven't done this on SunOS 5.6 (too old!), but it appears that the configure script is not finding the expected header files under your --with-ssl directory. What's in /opt/local/ssl? Around here (in the corresponding but different --with-ssl directory), there's an include/openssl subdirectory, which is packed with (links to) the header files which appear in those conftest.c compilation complaints. Also in the --with-ssl directory are a libcrypto.a and a libssl.a, which could become valuable later, if you ever get past the missing header files. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: timestamping feature with different output file name
I'm curious. Currently, -O may be used with multiple URLs on the command line. What would be the right way for this to work with -N? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Wget do not exit after the 100 percent of downloading;
I haven't seen this behavior, but adding -d to the Wget command line might tell you something about what it's doing. [...] Wget 1.5.3.1 1.10.2 is current. 1.5.3 is pretty old. http://www.gnu.org/software/wget/wget.html Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: wget seems not to respect -np
Do I do something stupid? Well, you didn't say which version of Wget (wget -V) you're using, or on which operating system you're running it. Also, you might get some helpful output if you add -d to your command. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Problem downloading 2GB files in new version ?
Content-Length: -2085613568 If the server reports a bad length, it's asking a lot to expect Wget to fix it. (It could be done, but not reliably.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Downloading large files
[...] wget.exe version 1.8.2. You might try Wget 1.10.2, which has large-file support (where the underlying operating system and C run-time library do). http://www.gnu.org/software/wget/wget.html Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Problem installing wget 1.10.2 on AIX
I haven't touched an AIX system for a _long_ while (4.1.4, as I recall), but if configure says: checking build system type... powerpc-ibm-aix5.2.0.0 === then I'd say that a move to AIX 4.3.3 would not generally be considered an upgrade. Clearly, what should be simple C compile-link-run is failing, but it's not obvious (to me) why. Basic questions would be: What is a cc command actually running? Does it need some AIX-specific options to work the way gcc does? Is there a libc.a object library somewhere (/usr/lib?), and does it contain a shr.o module? ("Could not load module libc.a(shr.o). System error: No such file or directory" sure looks suspicious to me.) Is there some alternate C compiler interface which might work better (like /usr/something/cc, instead of whatever cc is getting now)? Have you considered installing GCC? (You'd like to think that that wouldn't be needed.) Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Wget 1.10.2 bug
It seems to me that the -O option has wget touching the file which wget then detects. Close enough. With -O, Wget opens the output file before it does any transfers, so when the program gets serious about the transfer, the file will exist, and that will confuse the -nc processing. This is just one more case of -O not working well with other options. The fixes for these problems (when possible) are generally much more complicated than simply not using -O with other options when it causes trouble. Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: Fwd: Recursive FTP Fail -- unsupported file type?
It might help to know which version of Wget you're using, and on what you're using it. What's the purpose of the * in your command? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street[EMAIL PROTECTED] Saint Paul MN 55105-2547
Re: downloading https site with a username and password
It's hard to be sure without seeing the actual Web page, but if you're hittng a Submit button, then you're probably filling out a form, and so you probably need to specify the form data (such as the username and password) using the --post-data or --post-file options. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: -O switch always overwrites output file
Wget 1.10.2 is the current release, but I wouldn't expect a change in this behavior from the additional 0.0.2. The -nc option affects output files whose names are derived automatically from the URLs involved. The -nc code currently is not engaged for the user-specified -O file name, the code for which is in a different neighborhood. It may not have been a conscious decision, but that's the way it works now. Personally, I figure that if the user specifies the output file name, then it's his fault if the program overwrites his (precious) old file, but I wouldn't complain if someone added a -nc check for the -O file. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: missing files
[...] Any clue about that? Not in your posting. You might say which Wget version you're using, on which sort of system, and which files are not getting fetched, and then show the links to those files in the HTML which Wget should have followed. Without some actual information about what's happening (clues), it's not possible to say much which might be useful. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Missing K/s rate on download of 13MB file
From: Hrvoje Niksic The --.-- current download rate means that the download is currently not progressing. [...] Looking at the code in src/progress.c (version 1.10.2), it would appear that --.--K/s is emitted when either the time (hist->total_time) or the byte count (hist->total_bytes) is zero, rather than just when the time is zero. This precludes emitting a 0.0 rate, which would be more informative, in my opinion, than --.--K/s. 0/10 is quite well defined, even though 10/0 is not. When I see a rate like --.--K/s, I assume that there's not enough info to provide a real number (such as no bytes transferred), not that the value is zero. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: recursive download
From: Mauro Tortonesi [...] this is one of the pending bugs that will be fixed before the upcoming 1.11 release. At the risk of beating a dead horse yet again, is there any chance of getting the VMS changes into this upcoming 1.11 release? Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: WGET -O Help
From: David David 3. Outputs the graph to ta.html (replacing original ta.html)... BAD. On VMS, where (by default) it's harder to write to an open file, the symptom is different: ta.html: file currently locked by another user But the real question is: If a Web page has links to other files, how is Wget supposed to package all that stuff into _one_ file (which _is_ what -O will do), and still make any sense out of it? It might be practical to rig a new option to put the primary URL results into one file with a user-specified name, but still handle the page-requisites in the normal way, but, as currently implemented, -O is a long way from doing that. And I agree, those certainly are ugly file names. Could you make a simple redirecting Web page on a Web (or FTP) server of your own, with a _nice_ name, and then attack that page with Wget? (Ugly, but perhaps effective.) Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: WGET Out of Memory Error
From: oscaruser [...] wget (1.9.1) [...] Wget version 1.10.2 is the current release. [...] Is there a way to set the persistent state to disk instead of memory [...] I believe that there's a new computing concept called virtual memory which would handle this sort of thing automatically. How much swap space do you have available? How much free disk space do you have? How do you turn one into the other on your OS? Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: WGET -O Help
From: Mauro Tortonesi [EMAIL PROTECTED] perhaps we should make this clear in the manpage Always a good idea. and provide an additional option which just renames saved files after download and postprocessing according to a given pattern. IIRC, hrvoje was just suggesting to do this some time ago. what do you guys think? Sounds like a good thing to work on right after the VMS-related changes have been added. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: BUG: wget with option -O creates empty files even if the remote file does not exist
From: Eduardo M KALINOWSKI wget http://www.somehost.com/nonexistant.html -O localfile.html then file localfile.html will always be created, and will have length of zero even if the remote file does not exist. Because with -O, Wget opens the output file before it does any network activity, and after it's done, it closes the file and leaves it there, regardless of its content (or lack of content). You could avoid -O, and rename the file after the Wget command. You could keep the -O, and check the status of the Wget command (and/or check the output file size), and delete the file if it's no good. (And probably many other things, as well.) If you look through http://www.mail-archive.com/wget@sunsite.dk/, you can find many people who think that -O should do something else, but (for now) it does what it does. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Problem with recursion and standard output
As always, it might help to see which version of Wget, which Wget command was used, what the actual output was, and which operating system was used. However, ... You're right. And the most likely fix will be to add an error message telling you that -O and -r (and several other options) are incompatible. You might review some of the other recent -O complaints at http://www.mail-archive.com/wget@sunsite.dk/, and/or consider that, because Wget does recursion by looking for links in the files it downloads, you'd be asking the program to be reading and writing to the same file at the same time, which, while not necessarily impossible, would require a significantly different method of operation. What would be the value of the mess which would result from such a Wget command if it _did_ work? Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.11 alpha 1 released
From: Mauro Tortonesi ftp://alpha.gnu.org/pub/pub/gnu/wget/wget-1.11-alpha-1.tar.gz I assume that it would be pointless to look for the VMS changes here, but feel free to amaze me. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.11 alpha 1 released
First, a bit of history... From: Steven M. Schweda 15-DEC-2004 14:19:07.55 [...] http://www.antinode.org/dec/sw/wget.html [...] From Mauro Tortonesi Sun, 10 Apr 2005 23:21:00 -0500 [...] if you want your patches to be merged in our CVS, you should follow the official patch submission procedure (that is, posting your patches to the wget-patches AT sunsite DOT dk mailing list. each post should include a brief comment about what the patch does, and especially why it does so). this would save a lot of time to me and hrvoje and would definitely speed up the merging process. [...] From: Steven M. Schweda Mon, 18 Apr 2005 12:21:44 -0500 (CDT) [...] http://antinode.org/ftp/wget/patch1/ [...] From Mauro Tortonesi Mon, 19 Sep 2005 17:45:14 +0200 [...] the wget code is going through a major refactoring effort. later on, just before releasing wget 2.0, i promise i will re-evaluate your patches and merge them if they're not too intrusive. [...] From Mauro Tortonesi Tue, 13 Jun 2006 09:38:36 -0700 [...] i promise we'll seriously talk about merging your VMS changes into wget at the beginning of the 1.12 development cycle. [...] That would be nice, as it would have been nice every other time I've suggested it since Wget 1.9.1 in December 2004. you'll be very welcome to convince me about the soundness of your code and the need to merge VMS support into wget [...] Need? None at all, if you have no interest in providing any support for Wget on VMS, and if you have no interest in Wget working well with a VMS FTP server, and if you have no interest in the miscellaneous bug fixes I've made along the way. I simply assumed, as I had done all the necessary work for VMS support (client and server), and fixed a few bugs along the way, that you might find it worth the (pretty small) effort to incorporate my suggested changes. 
On the topic of the soundness of code, let's consider what happens on a Tru64 system fetching files from a VMS FTP server using the new Wget 1.11-alpha-1, the original Wget 1.10.2, and my Wget 1.10.2b.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

urtx# wg11 -V
GNU Wget 1.11-alpha-1 [...]
urtx# wg11 -r ftp://alp/wget_test/
--22:15:11--  ftp://alp/wget_test/
            => `alp/wget_test/.listing'
Resolving alp... 10.0.0.9
Connecting to alp|10.0.0.9|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD [ANONYMOUS.wget_test] ... done.
==> PASV ... done.    ==> LIST ... done.

    [ <=> ] 32          --.-K/s   in 0s

22:15:12 (640 B/s) - `alp/wget_test/.listing' saved [32]

Removed `alp/wget_test/.listing'.
Wrote HTML-ized index to `alp/wget_test/index.html' [198].
urtx#
urtx# cat ./alp/wget_test/index.html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head>
<title>Index of /wget_test on alp:21</title>
</head>
<body>
<h1>Index of /wget_test on alp:21</h1>
<hr>
<pre>
</pre>
</body>
</html>

Please observe that no files were downloaded.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

urtx# wg10 -V
GNU Wget 1.10.2 [...]
urtx# wg10 -r ftp://alp/wget_test/
--22:11:58--  ftp://alp/wget_test/
            => `alp/wget_test/.listing'
Resolving alp... 10.0.0.9
Connecting to alp|10.0.0.9|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD [ANONYMOUS.wget_test] ... done.
==> PASV ... done.    ==> LIST ... done.

    [ <=> ] 284           --.--K/s

22:11:58 (28.40 KB/s) - `alp/wget_test/.listing' saved [284]

Removed `alp/wget_test/.listing'.
--22:11:58--  ftp://alp/wget_test/EMPTY/
            => `alp/wget_test/EMPTY/.listing'
==> CWD [ANONYMOUS.wget_test.EMPTY] ... done.
==> PASV ... done.    ==> LIST ... done.

    [ <=> ] 32            --.--K/s

22:11:58 (2.91 KB/s) - `alp/wget_test/EMPTY/.listing' saved [32]

Removed `alp/wget_test/EMPTY/.listing'.
--22:11:58--  ftp://alp/wget_test/EMPTY/
            => `alp/wget_test/EMPTY/index.html'
==> CWD not required.
==> PASV ... done.    ==> RETR  ...
No such file `'.

--22:11:58--  ftp://alp/wget_test/NON-EMPTY/
            => `alp/wget_test/NON-EMPTY/.listing'
==> CWD [ANONYMOUS.wget_test.NON-EMPTY] ... done.
==> PASV ... done.    ==> LIST ... done.

    [ <=> ] 195           --.--K/s

22:11:58 (16.25 KB/s) - `alp/wget_test/NON-EMPTY/.listing' saved [195]

Removed `alp/wget_test/NON-EMPTY/.listing'.
--22:11:58--  ftp://alp/wget_test/NON-EMPTY/A.TXT
            => `alp/wget_test/NON-EMPTY/A.TXT'
==> CWD [ANONYMOUS.wget_test.NON-EMPTY] ... done.
==> PASV ... done.    ==> RETR A.TXT ... done.
Length: 6 (unauthoritative)

100%[====================>] 6             --.--K/s

22:11:58 (117.19 KB/s) - `alp/wget_test/NON-EMPTY/A.TXT' saved [6]

FINISHED --22:11:58--
Downloaded: 6 bytes in 1 files
urtx#

Please observe the spurious
Re: concurrent use of -O and -N options
From: Mauro Tortonesi Louis Gosselin (included in CC) asked me to reconsider my decision, as he believes the concurrent use of -O and -N options is actually very helpful. I thought that he found it useful in Wget 1.8.1, where it apparently worked differently. The way it works _now_ (1.10.2), I see no value to allowing -N with -O. But I _do_ see value in VMS support, so my opinion may not be worth much. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget file with wild cards
From: Bud Hinson Does anyone know how to get wget to use wild cards i.e. * and ? in a file of URL's? If you're using FTP, I thought that it was supposed to work. If you're trying to do it with HTTP, you're probably doomed, as the server probably won't do it for you, and, unlike FTP, HTTP does not naturally allow you to get a list of files so you could do it yourself. As usual, it might help to know your Wget version and operating system, and an example of what you'd like to work could also be useful. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.11 beta 1 released
you're very welcome to try it and report every bug you might encounter. Same failure (as wget 1.11 alpha 1) to fetch any files with "wget -r ftp://xxx/" from a VMS FTP server. See: http://www.mail-archive.com/wget@sunsite.dk/msg09074.html Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget silently overwrites a file when using -c and the server does not support resuming
From: Ori Avtalion wget -O Test.resume_me.avi [...] [...] Result: The old file will be silently overwritten. [...] You're working too hard. Using -O will overwrite the output file no matter what happens, whether the download works or not. That's what -O does. If you don't like it, don't use -O. If you look through the archive, you can find many other cases where -O caused various effects which various users did not like. It's a characteristic of -O. If you can see the same problem when you don't specify -O, feel free to re-complain. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: REST - error for files bigger than 4GB
From: Petr Kras [...] == PORT ... done.== REST 4998699942 ... REST failed, starting from scratch. [...] Something more might be learned from adding -d to your Wget command line. I don't use this continuation feature, but a quick look at the code suggests that Wget 1.10.2 is using 64-bit integers for file sizes/offsets if it's built with large-file capability, so it's not obviously defective. (If it can _say_ 4998699942, then it should be ok.) Are you certain that the FTP _server_ can handle file offsets greater than 4GB in the REST command? Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget how do I do...
From: Craig A. Finseth It might help to know which version of Wget you're using (wget -V), and on which system type you're running it. Adding -d to the wget command line might give you more clues as to what it's trying to do. Seeing the debug output might save considerable code tracing, as I, for example, don't have access (so far as I know) to an FTP server which acts that way. Probably useless guesswork: Does it help to add a trailing / to the URL (ftp://...:...@site/myDir/)? Same behavior with -r? Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget how do I do...
From: Craig A. Finseth [EMAIL PROTECTED] 1.9 going from Solaris 9 to a Windows server. Note that 1.10.2 is the current released version. http://www.gnu.org/software/wget/wget.html In fact, adding the trailing / solved the problem completely. I had hope, just not confidence. (Watching the -d output, even on a VMS system talking to a VMS system, was suggestive, however.) SMS.
Re: wget 1.10.1 segfaults after SYST
From: kurt . degrave I know I have to upgrade to 1.10.2, but Novell didn't release a patch yet. This may be a stupid question, but what does Novell have to do with wget on SUSE LINUX? Not knowing why not, I'd still say try wget 1.10.2. To which type of FTP server are you talking? Add -d to your wget command line, and see if the output is informative. Using a simple FTP client, connect to that server, send it a SYST command (typically by saying quote SYST, but it depends on the FTP client), and report the result. I can't give a core dump [...] A traceback could be useful. (dbx .../wget core, where?) Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.10.1 segfaults after SYST
From: Ryan Barrett novell bought SUSE a while ago. they now build and sell the SUSE distro, so they're responsible for support and patches. Ach so. (I have enough trouble keeping track of who's selling VMS these days. Linux vendors escape me pretty completely.) Still, there's no obvious reason not to try the current released version. It can be built and used to run a test without replacing the official one. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.10.1 segfaults after SYST
From: Kurt De Grave I prefer not to run unsigned software whenever possible. In that case, I think that you're doomed. (Unless you trust yourself to change the code, that is.) However, I have a SUSE 10.1 system around, with wget 1.10.2, and it segfaults as well. Ok. To which type of FTP server are you talking? I believe it is a Windows 2000 Server. It is definitely Windows. This seems to be the key: ==> SYST ... -- SYST 215 Lukemftp responds to SYST with: "215". The wget code was expecting a longer response than that. For example (antinode.org): "200 VMS OpenVMS V7.3 on node alp.antinode.org." or (ftp.hp.com): "215 UNIX Type: L8". That is, a 2xx code followed by some text describing the FTP server type. A quick look at src/ftp-basic.c suggests that the problem may lie here:

[...]
      /* Skip the number (215, but 200 (!!!) in case of VMS) */
      strtok (respline, " ");

      /* Which system type has been reported (we are interested just in the
         first word of the server response)? */
      request = strtok (NULL, " ");

      if (!strcasecmp (request, "VMS"))
        *server_type = ST_VMS;
[...]

With no text after the "215" in this FTP server's response, request is probably NULL, causing the first strcasecmp() to explode. It might work a little better if it looked like this:

[...]
      /* Skip the number (215, but 200 (!!!) in case of VMS) */
      strtok (respline, " ");

      /* Which system type has been reported (we are interested just in the
         first word of the server response)? */
      request = strtok (NULL, " ");

      if (request == NULL)
        *server_type = ST_OTHER;
      else if (!strcasecmp (request, "VMS"))
        *server_type = ST_VMS;
[...]

This code doesn't seem to be any different/better in wget 1.11-alpha-1, so it may not be any different/better yet in the main development code, either. Of course, because this FTP server doesn't actually identify itself as anything in particular, you can still expect to see a complaint from wget like: "Unsupported listing type, trying Unix listing parser."
And if the directory listing format is _not_ UNIX-like, then the whole thing may fail with confusing symptoms. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.10.1 segfaults after SYST
From: Kurt De Grave Lukemftp responds to SYST with: 215 I'm a little slow. I see now that Lukemftp is your FTP _client_ program. So, it's the (still mysterious) FTP _server_ which responds with the bare 215. (The FTP client just shows you what the server's response was.) Using the simple FTP client (Lukemftp), does the FTP server say anything self-identifying in its greeting when you first connect, or when you get logged in? The reason wget explodes this way is that no one who tested it ever ran into an FTP server so lame as this one, and hence did not anticipate a successful (2xx) but empty response to a SYST inquiry. It makes sense to change wget to avoid the failure, but it may make even more sense to switch to a better FTP server. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: I have seen something strange
From: Miquel Serra Llobera I have seen something strange, this is what happened: It would probably help to know which wget version you were using (wget -V). There could be a bad free() in the code which only happens when there's an error like this one. Reproducing this may not be very easy. If you can do it, you could "Try setting environment variable MallocHelp to see tools to help debug." Knowing exactly where the bad free() was could help much. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Parallel downloads provide significant speedup
From: Axel Boldt [...] parallel downloading provides considerable speedups in almost all settings. Perhaps in almost all _your_ settings. Not around here: DSL (Downstream Data Rate: 1536 Kbps), COMPAQ Professional Workstation XP1000 running OpenVMS V7.3-2, GNU Wget 1.10.2b built on VMS Alpha V7.3-2. (It's a single, 500MHz Alpha (EV6) CPU system, with wide Ultra SCSI disks, circa 1999.) You can do your own experiments simply enough [...] That's true. Looking at elapsed times, for two downloads in parallel (the same linux-2.6.17.9.tar.gz and linux-2.6.17.8.tar.gz): ET[1] = 620s, ET[2] = 628s, which are rates of about 83kB/s individually, about 166kB/s aggregated. For a single download (linux-2.6.17.10.tar.gz): ET = 322s, which is a rate of about 161kB/s. As a figure like 160kB/s is about the fastest I ever see in any context, I'm pretty confident that my bottleneck is the network. For only a few percent gain, I'd prefer that Mr. Tortonesi put his time and effort into integrating the changes needed for VMS, rather than pursuing parallel downloads. (Clearly, one will benefit me much more than the other.) Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.10.1 segfaults after SYST - solved
From: Kurt De Grave if (request == NULL) *server_type = ST_OTHER; else if (!strcasecmp (request, "VMS")) It works perfectly now. What could go wrong? One might consider committing this to the trunk. I have no idea if there are other stealthy FTP servers out there, but the FTP service in question is from a relatively large hosting provider. The wget maintainer seems to have a phobia involving changes I suggest, but this one probably has a better chance than anything VMS-related. It'll be in my next VMS-compatible kit, anyway. I'd probably complain to the FTP provider, too, but I'm a chronic complainer. Thanks! Thanks for the report. Glad to help. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Question / Suggestion for wget
From: Mitch Silverstein If -O output file and -N are both specified [...] When -O foo is specified, it's not a suggestion for a file name to be used later if needed. Instead, wget opens the output file (foo) before it does anything else. Thus, it's always a newly created file, and hence tends to be newer than any file existing on any server (whose date-time is set correctly). -O has its uses, but it makes no sense to combine it with -N. Remember, too, that wget allows more than one URL to be specified on a command line, so multiple URLs may be associated with a single -O output file. What sense does -N make then? It might make some sense to create some positional option which would allow a URL-specific output file, like, say, -OO, to be used so: wget http://a.b.c/d.e -OO not_dd.e http://g.h.i/j.k -OO not_j.k but I don't know if the existing command-line parser could handle that. Alternatively, some other notation could be adopted, like, say, file=URL, to be used so: wget not_dd.e=http://a.b.c/d.e not_j.k=http://g.h.i/j.k But that's not what -O does, and that's why you're (or your expectations are) doomed. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Im not sure this is a bug or feature... (2GB limit?)
From: Tima Dronenko Im not sure this is a bug or feature... wget -V If your wget version is before 1.10, it's a feature. At or after 1.10, it's a bug. (In some cases, the bug is in the server.) Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: new wget bug when doing incremental backup of very large site
1. It would help to know the wget version (wget -V). 2. It might help to see some output when you add -d to the wget command line. (One existing file should be enough.) It's not immediately clear whose fault the 416 error is. It might also help to know which Web server is running on the server, and how big the file is which you're trying to re-fetch. This was surprising [...] You're easily surprised. wget: realloc: Failed to allocate 536870912 bytes; memory exhausted. 500MB sounds to me like a lot. [...] it exhausts the memory on my test box which has 2GB. A "memory exhausted" complaint here probably refers to virtual memory, not physical memory. [...] I do not want it to check to see if the file is newer, if the file is complete, just skip it and go on to the next file. I haven't checked the code, but with continue=on, I'd expect wget to check the size and date together, and not download any real data if the size checks, and the local file date is later. The 416 error suggests that it's trying to do a partial (byte-range) download, and is failing because either it's sending a bad byte range, or the server is misinterpreting a good byte range. Adding -d should show what wget thinks that it's sending. Knowing that and the actual file size might show a problem. If the -d output looks reasonable, the fault may lie with the server, and an actual URL may be needed to pursue the diagnosis from there. The memory allocation failure could be a bug, but finding it could be difficult. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: new wget bug when doing incremental backup of very large site
From dev: I checked and the .wgetrc file has continue=on. Is there any way to surpress the sending of getting by byte range? I will read through the email and see if I can gather some more information that may be needed. Remove continue=on from .wgetrc? Consider: -N, --timestamping don't re-retrieve files unless newer than local. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: BUG - .listing has sprung into existence
From: Sebastian Doctor, it hurts when I do this. Don't do that. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: wget 1.10.1 segfaults after SYST
From: kaneda [...] == SYST ... Segmentation fault (core dumped) [...] This sounds like the same problem as the one under wget 1.10.1 segfaults after SYST. For details and the solution(s), try the thread beginning at: http://www.mail-archive.com/wget@sunsite.dk/msg09371.html It _was_ nice to see a problem report with some useful info (wget version, host OS, et c.) for a change. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: linux version crashes when reaching the max size limit
From: Toni Casueps 1. "it crashed" is not a helpful description of what happened. What actually happened? 2. If the file is too large for a FAT32 file system, what would you like to happen? 4294967295 looks like 2^32-1, which (from what I've read) is the maximum size of a file on a FAT32 file system. 3. Wget 1.10.2 is the latest released version. Complaints about older versions normally lead to a suggestion to try the latest version. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Documentation error?
From: Ian As usual, it might help to know which wget version you're using (wget -V) and on which system type you're using it. The documentation section 7.2 states: _Which_ documentation section 7.2? wget -r -l1 --no-parent -A.gif http://www.server.com/dir/ I don't normally use -A, but a Google search for wget -A found this: http://www.gnu.org/software/wget/manual/html_node/Types-of-Files.html which suggests that -A gif might work better than -A.gif. Adding -d to the wget command might also be informative. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: Accents in PHP parameter
14:13:04 ERROR 406: Not Acceptable. It looks to me as if the Web server does not like these characters. Adding -d to the wget command might tell you more about what wget is doing. Do you have any evidence of a URL like this which works in, say, a Web browser? GNU Wget 1.7 1.10.2 is the latest released version. If there is a problem with wget 1.7, _and_ if it's still a problem in 1.10.2, then someone might wish to work on it. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: hacking 'prefix'
I give up. What are you doing, what are you doing it with, what are you doing it on, what happens, and what would you like to have happen instead? (Hint: Actual commands and their output would help more than vague descriptions.) Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
Re: trouble loading and installing wget
From: Siddiqui, Kashif

> I'm trying to install wget on my itanium 11.23 system [...]

I assume that that's HP-UX 11.23, as in:

   [EMAIL PROTECTED] uname -a
   HP-UX td176 B.11.23 U ia64 1928826293 unlimited-user license

> /usr/lib/hpux32/dld.so: Unsatisfied code symbol '__umodsi3' in load module '/usr/local/bin/wget'.

And where did you get _that_ copy of wget?

> If I use the source code and run the configure script, then do a 'make install' I get the following error: [...]
> gcc -I. -I. -O -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O -c connect.c
> In file included from connect.c:41:
> /usr/include/sys/socket.h:535: error: static declaration of 'sendfile' follows non-static declaration
> [...]

Complaints about header files are often caused by a bad GCC installation (or an OS upgrade which confuses GCC). I just tried building my VMS-oriented 1.10.2c kit using GCC on one of the HP TestDrive systems, and I had some trouble ('ld: Unsatisfied symbol libintl_gettext in file getopt.o'), but that's much later than compiling connect.c, which got only the (usual) warnings about the pointers. That's with:

   http://antinode.org/dec/sw/wget.html
   http://antinode.org/ftp/wget/wget-1_10_2c_vms/wget-1_10_2c_vms.zip

   [EMAIL PROTECTED] gcc --version
   gcc (GCC) 3.4.3
   [...]

And I have no idea whether the GCC installation there is good or bad. (But it seems to be better than yours.) I also tried it using HP's C compiler ("CC=cc ./configure"):

   [EMAIL PROTECTED] cc -V
   cc: HP C/aC++ B3910B A.06.12 [Aug 17 2006]

Here, the make ran to an apparently successful completion, but real testing is not convenient on the TestDrive systems, so I can't say whether it would actually work better than what you have.

   [EMAIL PROTECTED] ./src/wget -V
   GNU Wget 1.10.2c built on hpux11.23.
   [...]

So, I'd suggest using HP's C compiler, or else re-installing GCC. After that, I'd suggest using the ITRC HP-UX forum:

   http://forums1.itrc.hp.com/service/forums/familyhome.do?familyId=117

> Any idea's and assistance [...]

That's "ideas", by the way.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: FTP SYST NULL dereferencing crash (found by someone else)
From: Ulf Harnhammar [EMAIL PROTECTED]

> +  if (request == NULL)
> +    {
> +      xfree (respline);
> +      return FTPSRVERR;
> +    }

Well, yeah, if you prefer returning an error code to trying a little harder. I prefer my change:

   if (request == NULL)
     *server_type = ST_OTHER;

Why punish the user when the FTP server behaves badly?

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: Wget timestamping is flawed across timezones
From: Remko Scharroo:

> Can this be fixed?

Of course it can be fixed, but someone will need to fix it, which would involve defining the user interface and adding the code to do the actual time offset. I assume that the user will need to specify the offset.

For an indication of what could be done, you might look for WGET_TIMEZONE_DIFFERENTIAL in my VMS-adapted src/ftp-ls.c: ftp_parse_vms_ls():

   http://antinode.org/dec/sw/wget.html

This is a common problem on VMS systems, which normally (sadly) use local time instead of, say, UTC. One result of this is that FTP servers on VMS tend to provide file date-times in the server's local time. I chose to add an environment variable (a VMS logical name on a VMS system) as the user interface partly for code simplicity (less work for me), and partly because VMS uses a similar logical name (SYS$TIMEZONE_DIFFERENTIAL) to specify the offset from UTC to local time, so the concept would already be familiar to a VMS user.

I use WGET_TIMEZONE_DIFFERENTIAL in the code only for a VMS FTP server, but I assume that it could easily be adapted to the other ftp_parse*_ls() functions. (Or a new command-line option could be used to specify the offset.) When I did the work, I probably didn't consider the possibility that any non-VMS FTP servers would provide file date-times in non-UTC. Otherwise I might have made it more general.

Trying to get my VMS-related changes into the main Wget development stream has been sufficiently unsuccessful that I don't spend much time working on adding features and fixes which are not trivially easy and which I don't actually need myself. But I wouldn't try to discourage anyone else.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: problem at 4 gigabyte mark downloading wikipedia database file.
From: Jonathan Bazemore:

> [...] I am using wget 1.9 [...] up to about the 4 gig mark [...]

Try the current version of wget, 1.10.2, which offers large-file support on many systems, possibly including your unspecified one.

   http://www.gnu.org/software/wget/wget.html

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: re: 4 gig ceiling on wget download of wiki database. Wikipedia database being blocked?
From: Jonathan Bazemore:

> I've repeatedly tried [...]

If it's still true that you're using wget 1.9, you can probably try until doomsday with little chance of success. Wget 1.9 does not support large files. Wget 1.10.2 does support large files.

> > Try the current version of wget, 1.10.2, which offers large-file support on many systems, possibly including your unspecified one.

Still my advice. In the future, it might help if you would supply some useful information, like the wget version you're using, and the system type you're using it on. Also, actual commands used and actual output which results would be more useful than vague descriptions like "consistently breaking" and "will not resume".

> I've used a file splitting program to break the partially downloaded database file into smaller parts of differing size. Here are my results: [...]

So, what, you're messing with the partially downloaded file, and you expect wget to figure out what to do? Good luck.

> [...] wget (to my knowledge) doesn't do error checking in the file itself, it just checks remote and local file sizes and does a difference comparison, downloading the remainder if the file size is smaller on the client side.

Only if it can cope with a number as big as the size of the file. Wget 1.9 uses 32-bit integers for file size, and that's not enough bits for numbers over 4GB. And if you start breaking up the partially downloaded file, what's it supposed to use for the size of the data already downloaded?

> Wikipedia doesn't have tech support, [...]

Perhaps because they'd get too many questions like this one too many times.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: Issue/query with the Wget
From: Manish Gupta

> Issue: when i pass a 300 MB file to wget in one shot, it willl not able to download the file at the client side.

Is this _really_ a problem, or are you only afraid that it might be a problem? 300MB is not a large file. 2GB (or, sometimes, 4GB) is a large file. The latest released wget version (1.10.2) should work with large files on systems which support large files.

> Do wget has the feature of buffer where it is holding the stream, if it there then by increasing or specifying th buffer limit, i think we can overcome the issue.

Wget writes the data to a file. If you have the disk space, it should work. People often use wget to download CD and DVD image files. Some older wget versions (without large-file support) had some problems with files bigger than 2GB (or 4GB, depending on the OS), but not version 1.10.2. Some _servers_ have problems with large files, but those are not wget problems.

As usual, it would help to know which version of wget you're using, on which host system type you're using it, and the OS version there.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: SI units
From: Lars Hamren

> Download speeds are reported as "K/s", where, I assume, "K" is short for kilobytes. The correct SI prefix for thousand is "k", not "K":
> http://physics.nist.gov/cuu/Units/prefixes.html

To gain some insight on this, try a Google search for:

   k 1024

I've seen contrary comments from people who apparently know no actual science, and who think that they know something about computers, claiming that 1000 is wrong, and that only 1024 is legitimate for "k" or "K". You have my best wishes in your quest to set the world straight on this one.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: Possibly bug
From: Yuriy Padlyak

> Have been downloading slackware-11.0-install-dvd.iso, but It seems wget downloaded more then filesize and I found:
> -445900K .. .. .. .. ..119% 18.53 KB/s
> in wget-log.

As usual, it would help if you provided some basic information. Which wget version ("wget -V")? On which system type? OS and version?

Guesswork follows. Wget versions before 1.10 did not support large files, and a DVD image could easily exceed 2GB. Negative file sizes are a common symptom when using a small-file program with large files.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: Downloading multiple pages
From: graham hadgraft

> I need some help using an application [...]

You seem to need some help asking for help.

> wget -r -l2 -A html -X cgi-bin -D www.somewebsite.co.uk/ -P /home/httpd/vhosts/somewebsite.co.uk/catalogs/somewebsite/swish_site/ http://www.somewebsite.co.uk/questions/
> This only index the index page of this folder. It wil not follow the links on the page. What would be the appropriate command to use to index all pages from that folder.

Did it occur to you that it might matter which version of wget you're using, and on which system type (and version)? Or that it might be difficult for someone else to guess what happens when no one else can see the Web page which seems to be causing your trouble? Does it actually have links to other pages?

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: Newbie Question - DNS Failure
From: Terry Babbey

> I installed wget on a HP-UX box using the depot package.

Great. Which depot package? (Anyone can make a depot package.) Which wget version ("wget -V")? Built how? Running on which HP-UX system type? OS version?

> Resolving www.lambton.on.ca... failed: host nor service provided, or not known.

First guess: You have a DNS problem, not a wget problem. Can any other program on the system (Web browser, nslookup, ...) resolve names any better?

Second guess: If DNS works for everyone else, I'd try building wget (preferably a current version, 1.10.2) from the source, and see if that makes any difference. (Who knows what name resolver is linked in with the program in the depot?)

Third guess: Try the ITRC forum for HP-UX, but you'll probably need more info than this there, too:

   http://forums1.itrc.hp.com/service/forums/familyhome.do?familyId=117

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: Newbie Question - DNS Failure
From: Terry Babbey

> > Built how?
> Installed using swinstall

How the depot contents were built probably matters more.

> > Second guess: If DNS works for everyone else, I'd try building wget (preferably a current version, 1.10.2) from the source, and see if that makes any difference. [...]
> Started to try that and got some error messages during the build. I may need to re-investigate.

As usual, it might help if you showed what you did, and what happened when you did it. Data like which compiler (and version) could also be useful.

On an HP-UX 11.23 Itanium system, starting with my VMS-compatible kit (http://antinode.org/dec/sw/wget.html, which shouldn't matter much here), I seemed to have no problems building using the HP C compiler, other than getting a bunch of warnings related to socket stuff, which seem to be harmless. (Built using "CC=cc ./configure" and "make".)

   td176 cc -V
   cc: HP C/aC++ B3910B A.06.13 [Nov 27 2006]

And I see no obvious name resolution problems:

   td176 ./wget http://www.lambton.on.ca
   --23:42:04--  http://www.lambton.on.ca/
             => `index.html'
   Resolving www.lambton.on.ca... 192.139.190.140
   Connecting to www.lambton.on.ca|192.139.190.140|:80... failed: Connection refused.

   td176 ./wget -V
   GNU Wget 1.10.2c built on hpux11.23.
   [...]

That's on an HP TestDrive system, which is behind a restrictive firewall, which, I assume, explains the connection problem. (At least it got an IP address for the name.) And it's not the same OS version, and who knows which patches have been applied to either system?, and so on.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: wget error report
From: Daniele Annesi

> I think it is a Bug: using wget for multiple files, e.g.:
> wget ftp://user:[EMAIL PROTECTED]/*.zip
> in the time of each file the seconds are set to 00

That's not an error report. An error report would tell the reader which version of wget you were using ("wget -V"), on which system type you were using it, and the OS version, at least. It would also help to know how the FTP server reports the date-times in its listings, as that's where wget gets the information. If the server doesn't provide the seconds, how can wget set them? (And of course, without more information we can't see the date-time data for ourselves.)

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: is there any plan about supporting different charsets?
From: Leo Jay

> since the responds of ftp server could be in different charsets, and wget can't cope with charsets other than English, i'd like to know is there any plan about supporting different charsets?

Are you complaining about dates in different languages, or file names in different character sets?

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: is there any plan about supporting different charsets?
From: Leo Jay

> > Are you complaining about dates in different languages, or file names in different character sets?
> i'm talking about dates in different languages. i haven't tried file names in different charsets, but i'm sure wget can't cope with dates in different languages.

If you look in src/ftp-ls.c: ftp_parse_unix_ls(), you should find an array of month names:

   static const char *months[] = {
     "Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
   };

If by "dates in different languages" you mean that non-English month names are the only problem, then it should be fairly easy to extend this with month names in other languages, and then change the code below ("if (i != 12)", "month = i;") to something a little more complex, to handle the new possibilities. If the order of the tokens also changes, then you may need to dive into the hideously complex parsing code, and make it even more hideously complex. (The fellow who designed the date format(s) for "ls" was obviously targeting an intelligent human audience, not another computer program. The order and simplicity of a VMS DIRECTORY listing shows some evidence of actual design, and parsing such a listing is relatively trivial, but that won't help you any.)

I might offer a few more details, but your specification of the problem is not complete enough to make that practical. If you can list a set of date forms which must be interpreted, then it might be possible to say how hard it would be to do the job. (I assume that there is no actual ambiguity in the month name strings for the languages you would like to support, but that could make the problem impossible to solve for some languages.)

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: is there any plan about supporting different charsets?
From: Leo Jay

> i had already hacked the src/ftp-ls.c to meet my need before i posted this thread. but my approach is just hard coding, which i think is not a good way to solve this problem and lack of flexibility. so, i wonder if the wget developers have any plan to solve this problem. and i think their solution must be very elegant (at least than mine).

Wget developers are people who develop wget. Anyone can do it.

> and the attachment is my modification for big5 charset. could you please have a look at it for its correctness? thanks.

What is a "big5 charset"? I can't look for correctness until I know what you're trying to do. You may know what you want, but it's not clear to me.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547
Re: is there any plan about supporting different charsets?
From: Willener, Pat

> http://en.wikipedia.org/wiki/Big5

Ok. Thanks for the pointer.

From: Leo Jay

> the attachment is a sample .listing file.

I don't know if anyone plans to do anything about multi-byte characters anywhere in wget, and I know that I can't read them, but I see no reason why the existing code (with extensions already suggested) should not be able to handle any byte-character string you specify for a month name, whether or not it makes any sense as byte characters. (One could add an array of different spellings of "total", too.)

That is, I believe that you could append your big5_months[] strings to the existing months[] array (and add as many other sets (of twelve) as you'd like), and then make changes something like:

   [...]
   #define MONTHS_LEN (sizeof( months) / sizeof( months[ 0]))
   [...]
   for (i = 0; i < MONTHS_LEN; i++)
   [...]
   if (i != MONTHS_LEN)
   [...]
   month = i % 12;
   [...]

Assuming that the strings like "26+ 0xa4+ 0xeb" are day numbers, it appears that you got pretty lucky with wget's simple-minded day_number-to-integer conversion method. Not much work needed there.

Note that a few bytes of storage could be saved by specifying empty strings ("") instead of duplicates, where other languages look like English. For example:

   static const char *months[] = {
     "Jan", "Feb", "Mar", "Apr", "May", "Jun",   /* English. */
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
     "",    "",    "Mär", "",    "Mai", "",      /* German. */
     "",    "",    "",    "Okt", "",    "Dez"
   };

As for getting changes like this into the main development code, I'm probably the wrong person to ask, as I've been trying for years to get a set of VMS-related changes adopted with no obvious success. A while back, another fellow had a similar complaint about German month names:

   http://www.mail-archive.com/wget@sunsite.dk/msg07775.html

I seem to have sent him some private e-mail, but I didn't post anything to the forum at that time. But it does show that there is some interest in this problem other than yours.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547