Hi,
> It appears wget may be creating slightly malformed GZIP skip-length
> fields
I think that's correct: Wget doesn't write the subfield length in the
"extra field" section of the header. After the subfield ID "sl" it
should write the length LEN (see RFC 1952 [1]), but it doesn't.
Luckily,
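
For illustration, here is a minimal Python sketch of the subfield layout that RFC 1952 describes (SI1, SI2, a two-byte little-endian LEN, then LEN bytes of data). The helper names and the payload (two 32-bit sizes) are assumptions made for this example and are not taken from Wget's source. It only shows why a strict parser may reject an extra field whose LEN is missing.

```python
import struct


def build_subfield(subfield_id: bytes, data: bytes) -> bytes:
    """Encode one extra-field subfield as SI1 SI2 LEN data (RFC 1952)."""
    if len(subfield_id) != 2:
        raise ValueError("subfield ID must be exactly two bytes")
    return subfield_id + struct.pack("<H", len(data)) + data


def build_extra_field(subfields: bytes) -> bytes:
    """Prefix the concatenated subfields with the two-byte XLEN."""
    return struct.pack("<H", len(subfields)) + subfields


def parse_extra_field(extra: bytes) -> list:
    """Strictly parse XLEN plus subfields; raise ValueError if malformed."""
    (xlen,) = struct.unpack_from("<H", extra, 0)
    pos, end = 2, 2 + xlen
    if end > len(extra):
        raise ValueError("XLEN exceeds available bytes")
    subfields = []
    while pos < end:
        if pos + 4 > end:
            raise ValueError("truncated subfield header")
        subfield_id = extra[pos:pos + 2]
        (length,) = struct.unpack_from("<H", extra, pos + 2)
        pos += 4
        if pos + length > end:
            raise ValueError("subfield data overruns the extra field")
        subfields.append((subfield_id, extra[pos:pos + length]))
        pos += length
    return subfields


# Hypothetical payload: two little-endian 32-bit sizes (an assumption).
payload = struct.pack("<II", 1234, 5678)

# With LEN written after the "sl" ID, as RFC 1952 requires.
well_formed = build_extra_field(build_subfield(b"sl", payload))

# With LEN omitted: the payload follows the ID directly.
malformed = build_extra_field(b"sl" + payload)
```

With these inputs, `parse_extra_field(well_formed)` returns the single "sl" subfield, while `parse_extra_field(malformed)` raises `ValueError`, because the first two payload bytes are read as LEN and point past the end of the field.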
Tim Rühsen writes:
> Unzipping it and zipping it again results in a 2387 byte file.
>
> So, at first glance, it looks like Wget compresses very suboptimally.
> But I won't say it is a bug before I take a deeper look... (in the next days).
That's probably working as intended. By conven
On Friday, 29 March 2013, Andy Jackson wrote:
> When using wget 1.14 to generate warc.gz files, e.g.
>
> wget -O tempname --warc-file="output" "http://example.com";
>
> the files this creates do not play back well using the Internet Archive's
> warc.gz parsers, throwing errors like
>
> "Inva