Re: [Bug-wget] --header="Accept-encoding: gzip"

andreas wpv Wed, 23 Sep 2015 19:10:18 -0700

Thanks for the insights, and for working on the next version.
andreas

On Wed, Sep 23, 2015 at 3:10 AM, Tim Ruehsen <[email protected]> wrote:


> > wget --user-agent "Mozilla/5.0 (Windows NT x.y; WOW64; rv:10.0)
> > Gecko/20100101 Firefox/10.0" -e robots=off
> > --header="accept-encoding: gzip" -p -H "www.google.com"
> >
> > Still only gives me 52 KB, and one file: index.html
> >
> > So, accept encoding seems to work, but only for the main file?
>
> As Ángel said, the main file is gzipped, but wget can't parse gzipped
> content. That's why you get just one file (index.html). (This file could be
> named index.html.gz to reflect the content.)
> You could decompress it with gzip -d and feed the resulting HTML file to
> wget manually, like wget -r --force-html --input-file index.html --base
> www.google.com
>
> There have been patches to support gzip encoding, but either they were
> half-baked or the authors did not sign the FSF copyright assignment.
>
> *Note*
> [Meanwhile, we are working on wget2. Content encodings like gzip and
> deflate are already built in there, as are lzma and bzip2 for even better
> compression (but servers don't support those out-of-the-box yet).]
>
> Regards, Tim
>
>
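
The workaround Tim describes in the quoted text could be sketched roughly as
below. This is only an illustration under assumptions: the function names, the
output file name, and the base URL are placeholders that do not come from the
thread. It reads the file through stdin because gzip -d applied directly to a
file without a .gz suffix typically refuses to touch it.

```shell
#!/bin/sh
# Sketch only: function names, file names and the base URL are hypothetical.

# Decompress a gzip-encoded page that wget saved without a .gz suffix.
# Reading from stdin sidesteps gzip's check for a known file suffix.
decode_gzip_page() {
    src=$1   # e.g. index.html as saved by wget (really gzip data)
    dst=$2   # e.g. index.decoded.html (placeholder name)
    gzip -dc < "$src" > "$dst"
}

# Hand the decoded HTML back to wget so it can follow the links in it,
# using the same options Tim suggested.
fetch_links_from() {
    page=$1  # decoded HTML file
    base=$2  # base URL used to resolve relative links
    wget -r --force-html --input-file="$page" --base="$base"
}
```

For example: decode_gzip_page index.html index.decoded.html, followed by
fetch_links_from index.decoded.html http://www.google.com/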
