Hello Gijs, just out of curiosity.
What about setting the compiler option -D_FILE_OFFSET_BITS=64 on these
systems? Since off_t is used in many places for file lengths, there
should be many more problems with large files.

I just wonder how large files are generally handled on these PowerPC and
ARM systems. If there is no such general way, using off_t wouldn't make
sense (unless these systems can't handle large files at all, but then
your patch wouldn't make sense).

Maybe you could shed some light on this...

Regards, Tim

On Monday 12 November 2012, Gijs van Tulder wrote:
> Hi,
>
> There's a somewhat serious issue in the WARC-generating code: on some
> platforms (presumably the ones where off_t is not a 64-bit number) the
> Content-Length header at the top of each WARC record has an incorrect
> length. On these platforms it is sometimes 0, sometimes 1, but never the
> correct length. This makes the whole WARC file unreadable.
>
> The code works fine on many platforms, but it is apparently a problem on
> some PowerPC and ARM systems, and maybe other systems as well.
>
> Existing WARC files with this problem can be repaired by replacing the
> value of the Content-Length header with the correct value, for each WARC
> record in the file. The content of the WARC records is there, it's just
> the Content-Length header that is wrong.
>
> The attached patch fixes the problem in warc.c. It replaces off_t by
> wgint and uses the number_to_static_string function from util.c.
>
> Regards,
>
> Gijs
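
For illustration, here is a minimal sketch of one way to format an off_t length that does not depend on how wide off_t is on a given platform. It assumes a C99 printf (for the "%jd" conversion); format_length is a hypothetical helper, not a function from wget. Whether a width mismatch of this kind is the actual cause in warc.c is not confirmed here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper (not part of wget): write LEN as decimal text
   into BUF.  off_t may be 32 or 64 bits wide depending on the platform
   and on _FILE_OFFSET_BITS, so it is cast to intmax_t and printed with
   "%jd" instead of a conversion that assumes a fixed width.
   Returns what snprintf returns: the number of characters needed,
   not counting the terminating NUL.  */
int
format_length (char *buf, size_t size, off_t len)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "%jd", (intmax_t) len);
}
```

A caller would pass a buffer large enough for the digits, e.g. `char buf[32]; format_length (buf, sizeof buf, len);`, and then write buf into the Content-Length header.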
