In these enlightened times when 2G+ or "large" files are no longer considered large even in the third world, more and more people ask for the ability to download huge files with Wget.

Wget carefully uses `long' for potentially "large" values, such as file sizes and offsets, but that has no effect on the most popular 32-bit architectures, where `long' and `int' are both 32-bit quantities. (It does help on 16-bit architectures where `int' is 16-bit, and it helps under 64-bit "LP64" environments where `int' is 32-bit, but `long' and `long long' are 64-bit.)

There have been several attempts to fix this:

* The hack called VERY_LONG_TYPE is used to store values that can be reasonably larger than 2G, such as the sum of all downloads. However, on machines without `long long', VERY_LONG_TYPE will be `long'. Since it is not used for anything critical, that's not much of a problem (and Wget is careful to detect overflows when adding to the sum, so bogus values are not printed).

* SuSE incorporated patches that change Wget's use of `long' to `unsigned long', which upgraded the limit from 2G to 4G. Aside from all the awkwardness that comes from unsigned arithmetic (checking for error conditions with x<0 doesn't work; you have to use x==-1), its effect is limited: if I want to download a 3G file today, I'll want to download a 5G file tomorrow.

* In its own patches, Debian introduced the use of large file APIs and `long long'. While that's perfectly fine for Debian, it is not portable. Neither the large file API nor `long long' is universally available, and both need thorough configure checking.

I believe that large numbers and large files are orthogonal. We need a large numeric type to represent numbers that *could* be large, be it the sum of downloaded bytes, remote file sizes, or local file sizes or offsets. Independently, we need to use the large file API where available, to be able to write and read large files locally.

Of those two issues, choosing and using the numeric type is the hard one.
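
To make two of the points above concrete, here is a rough sketch of what overflow detection on a running byte total and the `unsigned long' error check might look like. The function names are invented for illustration and this is not Wget's actual code; it only assumes a standard <limits.h>.

```c
#include <limits.h>

/* Hypothetical helper, not taken from Wget: add a non-negative byte
   count to a running total without ever overflowing `long'.
   Returns 1 on success; returns 0 and leaves *total unchanged if
   either value is negative or the sum would exceed LONG_MAX.  */
static int
add_bytes_checked (long *total, long nbytes)
{
  if (nbytes < 0 || *total < 0)
    return 0;
  /* Test against the remaining headroom instead of adding first,
     because signed overflow is undefined behavior in C.  */
  if (nbytes > LONG_MAX - *total)
    return 0;
  *total += nbytes;
  return 1;
}

/* Hypothetical helper: with `unsigned long', an error value of -1
   wraps around to ULONG_MAX, so `x < 0' can never be true and the
   check has to compare against the converted sentinel.  */
static int
is_error_unsigned (unsigned long x)
{
  return x == (unsigned long) -1;
}
```

Note that neither helper raises the 2G (or 4G) ceiling on a 32-bit `long'; they only keep bogus values from being stored or printed.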

Autoconf helps only to an extent -- even if you define your own `large_number_t' typedef, which is either `long' or `long long', the question remains how to print that number. Even worse, some systems have `long long' (because they use gcc), but don't support it in libc, so printf can't print it.

One way to solve this is to define macros for printing types. For example:

    #ifdef HAVE_LONG_LONG
    typedef long long large_number_t;
    # define LN_PRINT "lld"
    #else
    typedef double large_number_t;
    # define LN_PRINT "f"
    #endif

Then this becomes legal code:

    large_number_t num = 0;
    printf ("The number is: %" LN_PRINT "!\n", num);

Aside from being butt-ugly, this code has two serious problems.

1. Concatenation of adjacent string literals is an ANSI feature and would break pre-ANSI compilers.

2. It breaks gettext. With translation support, the above code would look like this:

       large_number_t num = 0;
       printf (_("The number is: %" LN_PRINT "!\n"), num);

   The message snarfer won't be able to process this because it expects a string literal inside _(...). Even if it were taught about string concatenation, it wouldn't know what to replace LN_PRINT with, unless it ran the preprocessor. And if it ran the preprocessor, it would get non-portable results ("lld" or "f") which cannot be stored in the message catalog.

The bottom line is, I really don't know how to solve this portably. Does anyone know how widely ported software deals with large files?
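
For the sake of discussion, one direction that might sidestep both problems is to keep the type-specific directive out of the translatable string altogether: convert the number to text by hand and print it with plain %s. The following is only a sketch under assumptions, not a tested portable solution. The names `large_number_to_string' and LN_BUFSIZE are invented here, HAVE_LONG_LONG is assumed to come from configure, the fallback type is `long' rather than `double' to keep the digit loop simple, and the handling of negative values assumes that integer division truncates toward zero (guaranteed by C99, implementation-defined in C89).

```c
#include <string.h>

#ifdef HAVE_LONG_LONG           /* assumed to be set by configure */
typedef long long large_number_t;
#else
typedef long large_number_t;    /* fallback chosen for this sketch */
#endif

/* Three decimal digits per byte is a safe over-estimate; the extra
   two bytes hold the sign and the terminating NUL.  */
#define LN_BUFSIZE (sizeof (large_number_t) * 3 + 2)

/* Hypothetical helper: store the decimal form of N in BUF, which
   must have room for at least LN_BUFSIZE bytes, and return BUF.
   The digits are produced by hand, so no printf directive for
   large_number_t is needed and libc support for `long long' does
   not matter.  */
static char *
large_number_to_string (large_number_t n, char *buf)
{
  char tmp[LN_BUFSIZE];
  char *p = tmp + sizeof tmp;
  int negative = n < 0;

  *--p = '\0';
  do
    {
      /* For negative N the remainder is zero or negative (assuming
         truncating division), so take its absolute value here rather
         than negating N, which would overflow for the minimum value.  */
      int digit = (int) (n % 10);
      *--p = (char) ('0' + (digit < 0 ? -digit : digit));
      n /= 10;
    }
  while (n != 0);
  if (negative)
    *--p = '-';
  return strcpy (buf, p);
}
```

With such a helper, the translatable message would contain only %s, which the message snarfer can handle as an ordinary string literal:

    char buf[LN_BUFSIZE];
    printf (_("The number is: %s!\n"), large_number_to_string (num, buf));

The cost is an extra buffer at every call site, and it does nothing for reading such numbers back in.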