Hi Jeffrey,

Thanks a lot for your feedback. This is what helps us improve.

* Tim Rühsen <tim.rueh...@gmx.de> [180407 00:01]:
> On 06.04.2018 23:30, Jeffrey Fetterman wrote:
> > Thanks to the fix that Tim posted on gitlab, I've got wget2 running just
> > fine in WSL. Unfortunately it means I don't have TCP Fast Open, but given
> > how fast it's downloading a ton of files at once, it seems like it must've
> > been only a small gain.
> >
TCP Fast Open will not save you a lot in your particular scenario. It simply
saves one round trip when opening a new connection. So, if you're using Wget2
to download a lot of files, you are probably only opening ~5 connections at the
beginning and reusing them all. It depends on your RTT to the server, but 1 RTT
when downloading several megabytes is already an insignificant amount of time.
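
To put rough numbers on that, here is a back-of-the-envelope sketch. The RTT,
file size and bandwidth are made-up values, and it ignores the TLS handshake
and TCP slow start, so treat it as an illustration rather than a measurement:

```shell
#!/bin/sh
# All numbers below are illustrative assumptions, not measurements.
rtt_ms=50            # assumed round-trip time to the server
size_mb=5            # assumed size of one downloaded file, in megabytes
bandwidth_mbit=50    # assumed sustained throughput, in Mbit/s

# Transfer time in ms: size in megabits divided by bandwidth, times 1000.
transfer_ms=$(( size_mb * 8 * 1000 / bandwidth_mbit ))

# Share of the total time taken by the one round trip TFO would save
# (integer arithmetic, so the result is rounded down).
saved_pct=$(( rtt_ms * 100 / (rtt_ms + transfer_ms) ))

echo "transfer: ${transfer_ms} ms, one RTT: ${rtt_ms} ms (~${saved_pct}% of the total)"
```

With these assumed values the saved round trip is about 5% of a single 5 MB
download, and since it is paid once per connection rather than once per file,
the share shrinks further as connections are reused.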

> >
> > I've come across a few annoyances however.
> >
> > 1. There doesn't seem to be any way to control the size of the download
> > queue, which I dislike because I want to download a lot of large files at
> > once and I wish it'd just focus on a few at a time, rather than over a
> > dozen.
> The number of parallel downloads ? --max-threads=n

I don't think he meant --max-threads. Given that he is using HTTP/2, there's a
chance that what he's seeing is HTTP/2 stream multiplexing. There is also
`--http2-request-window`, which you can try.
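
A minimal sketch of combining the two options. The values and the URL are
placeholders, and the "one connection per thread" bound is my assumption about
how the two limits interact, so check it against what you actually observe:

```shell
#!/bin/sh
# Illustrative values only; tune them for your server and link.
max_threads=2        # --max-threads: number of parallel downloader threads
request_window=3     # --http2-request-window: parallel streams per HTTP/2 connection
url="https://example.com/"   # placeholder target

# Rough upper bound on requests in flight, ASSUMING one connection per thread.
in_flight=$(( max_threads * request_window ))
cmd="wget2 --max-threads=${max_threads} --http2-request-window=${request_window} --mirror ${url}"

echo "at most ${in_flight} requests in flight"
echo "$cmd"    # printed as a dry run; run it yourself once it looks right
```
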
> > 3. Doing a TLS resume will cause a 'Failed to write 305 bytes (32: Broken
> > pipe) error to be thrown', seems to be related to how certificate
> > verification is handled upon resume, but I was worried at first that the
> > WSL problems were rearing their ugly head again.
> Likely the WSL issue is also affecting the TLS layer. TLS resume is
> considered 'insecure',
> thus we have it disabled by default. There still is TLS False Start
> enabled by default.
> > 3. --no-check-certificate causes significantly more errors about how the
> > certificate issuer isn't trusted to be thrown (even though it's not
> > supposed to be doing anything related to certificates).
> Maybe a bit too verbose - these should be warnings, not errors.

@Tim: I think that with `--no-check-certificate` these should be neither
warnings nor errors. The user explicitly stated that they don't care about the
validity of the certificate. Why add any information there at all? Maybe we
keep it only in debug mode.

> > 4. --force-progress doesn't seem to do anything despite being recognized as
> > a valid parameter, using it in conjunction with -nv is no longer beneficial.
> You likely want to use --progress=bar. --force-progress is to enable the
> progress bar even when redirecting (e.g. to a log file).
> @Darshit, we should adjust the behavior to be the same as in Wget1.x.

I think the progress bar options are sometimes a little off since we don't have
tests for those and I am the only one using them.

When exactly did you try to use --force-progress? I will change the
documentation today to reflect its actual use case. --force-progress is useful
only in --quiet mode. Which, TBH, doesn't make much sense to me since simply
--progress=bar will essentially put you in the same mode. AFAIR, this comes
from trying to bring in option compatibility from Wget 1.x.
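
As a sketch of the two invocations I am comparing (the URL is a placeholder,
and the commands are only printed here, not executed):

```shell
#!/bin/sh
url="https://example.com/file.iso"   # placeholder target

# Ask for the progress bar directly.
with_bar="wget2 --progress=bar ${url}"

# Quiet mode, with the progress bar forced back on.
quiet_forced="wget2 --quiet --force-progress ${url}"

echo "$with_bar"
echo "$quiet_forced"
```

As far as I can tell, both should leave you looking at little more than the
progress bar, which is why the second form seems redundant to me.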

@Tim: Adjusting behaviour to the same as Wget 1.x doesn't make a lot of sense
for the progress bar. In Wget 1.x, the default mode is: progress bar + verbose.
Whereas, in Wget2, progress-bar will effectively enable the non-verbose mode
where only warnings and errors are printed. I am noting this down for now. When
I have a little time, I will think about all the progress and verbosity options
in Wget 1.x and make sure that they do something similar in Wget2. Though, they
won't have the exact same behaviour.
> > 5. The documentation is unclear as to how to disable things that are
> > enabled by default. Am I to assume that --robots=off is equivalent to -e
> > robots=off?
> -e robots=off should still work. We also allow --robots=off or --no-robots.
> > 6. The documentation doesn't document being able to use 'M' for chunk-size,
> > e.g. --chunk-size=2M
> The wget2 documentation has to be brushed up - one of the blockers for
> the first release.
> >
> > 7. The documentation's instructions regarding --progress is all wrong.
> I'll take a look the next days.

Thanks for the heads up. Will look into it when I look at the rest of the
progress options.
> >
> > 8. The http/https proxy options return as unknown options despite being in
> > the documentation.
> Yeah, the docs... see above. Also, proxy support is currently limited.
> > Lastly I'd like someone to look at the command I've come up with and offer
> > me critiques (and perhaps help me address some of the remarks above if
> > possible).
> No need for --continue.
> Think about using TLS Session Resumption.
> --domains is not needed in your example.

You already use TLS resume, but you don't need to specify a session file
explicitly. By default it will use ~/.wget-session.

> Did you build with http/2 and compression support ?
> Regards, Tim
> > #!/bin/bash
> >
> > wget2 \
> >       `#WSL compatibility` \
> >       --restrict-file-names=windows --no-tcp-fastopen \
> >       \
> >       `#No certificate checking` \
> >       --no-check-certificate \
> >       \
> >       `#Scrape the whole site` \
> >       --continue --mirror --adjust-extension \
> >       \
> >       `#Local viewing` \
> >       --convert-links --backup-converted \
> >       \
> >       `#Efficient resuming` \
> >       --tls-resume --tls-session-file=.\tls.session \
> >       \
> >       `#Chunk-based downloading` \
> >       --chunk-size=2M \
> >       \
> >       `#Swiper no swiping` \
> >       --robots=off --random-wait \
> >       \
> >       `#Target` \
> >       --domains=example.com example.com
> >

Thanking You,
Darshit Shah
PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6
