[ I've Cc'd the list to invite comment on the idea of overriding
Berkeley Sockets and I/O functions for testing. ]

Yoshihiro Tanaka wrote:

>> Actually, the .px test files already spawn HTTP and/or FTP servers to do
>>  their testing.
>>
>>  Part of the problem with doing testing with a real local server, though,
>>  is that there are a lot of cases I want to handle that involve things
>>  like multiple hostnames/servers. With spawning _real_ servers, the only
>>  name we can really count on is "localhost".
>>
>>  Ideally, I'd like to completely encapsulate the Berkeley Sockets API, so
>>  we can simply swap in a fake one for testing purposes. Then we could do
>>  all the protocol-level tests, without having to involve real servers and
>>  their shortcomings.
> 
> Sorry, I'm not sure about that. If we use a fake one, how do you test
> without connecting to an HTTP/FTP server?

Here's the problem with using real servers for core testing. (I do
think we should still run tests against real servers: functionality
tests to ensure that Wget is basically functional. I just don't think
we should rely heavily on them.)

With a real remote server, I can't assume that a test failure means
Wget actually misbehaved: it might have been a network failure.

With a real locally-spawned server, I can't test arbitrary hostnames. So
I can't do testing of issues related to --span-hosts, for instance, or
ensure that Wget only sends authorization tokens to the correct server,
and no others, or that Wget never puts the "user:password@" part of URLs
in the Referer header. This could be done with a remote server, but it
would require setting up multiple servers (or, at the least, multiple
names for a given server); and, as already mentioned, I don't like
making Wget's tests depend on the availability of remote servers.

We _could_ spawn a local "proxy" server and have it pretend to Wget
that it's forwarding requests to different servers, when in fact it
answers everything itself. However, ideally we should be explicitly
testing these things both with and without a proxy: some behaviors may
change depending on that (such as the recently reported bug against
Wget's HTTP authentication).

With real servers, it's hard to test how Wget handles rare error codes
from some of the Berkeley Sockets functions or I/O functions.
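
To make that concrete, here's a rough sketch of the kind of override I
have in mind (all names are invented; the real thing would live behind
Wget's existing socket wrappers). A test swaps in a fake connect() that
fails with whatever rare errno we want to exercise:

/* Hypothetical sketch: Wget's code calls through a pointer instead of
   calling connect() directly, so a test can inject rare errors.  */
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>

static int (*wget_connect) (int, const struct sockaddr *, socklen_t)
  = connect;

/* Test double: pretend the network is unreachable.  */
static int
fake_connect (int fd, const struct sockaddr *sa, socklen_t len)
{
  (void) fd; (void) sa; (void) len;
  errno = ENETUNREACH;
  return -1;
}

int
main (void)
{
  wget_connect = fake_connect;  /* the test swaps in the fake */
  if (wget_connect (-1, NULL, 0) == -1)
    perror ("connect");         /* "connect: Network is unreachable" */
  return 0;
}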

Overriding socket I/O operations also lets us write tests against cases
when there's a specific number of bytes available for reading, which
actually does correspond to a bug that was fixed not too long ago. It
also lets us dictate exact sequences of server behaviors across multiple
hosts (I can't think of any uses for that in current Wget, though there
probably are some; but there will be plenty once Wget supports multiple
simultaneous connections).
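
For the bytes-available case, the fake layer could replay a script of
exact chunk sizes. Again a sketch, with invented names:

/* Hypothetical sketch: a scripted read that returns exactly one
   pre-arranged chunk per call, reproducing awkward packet splits.  */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

static const char *chunks[] =
  { "HTTP/1.1 200 OK\r", "\nContent-Le", "ngth: 5\r\n\r\nhello" };
static size_t next_chunk;

static ssize_t
fake_read (char *buf, size_t bufsize)
{
  size_t n;
  if (next_chunk >= sizeof chunks / sizeof *chunks)
    return 0;                   /* pretend the peer closed */
  n = strlen (chunks[next_chunk]);
  if (n > bufsize)
    n = bufsize;
  memcpy (buf, chunks[next_chunk++], n);
  return n;
}

int
main (void)
{
  char buf[64];
  ssize_t n;
  while ((n = fake_read (buf, sizeof buf)) > 0)
    printf ("read %d bytes\n", (int) n);  /* header split mid-token */
  return 0;
}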

Emulating the networking layer also means that we can write tests for
how Wget handles buggy or quirky servers (especially useful for FTP),
without needing those real servers set up somewhere for testing (that
couldn't be done with locally-spawned servers, either).
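
For quirky servers, the "script" could be as simple as a table of
expected commands and canned replies that the fake layer plays back.
The format below is invented, purely to illustrate:

/* Hypothetical sketch: a scripted FTP "server" as a lookup table the
   fake sockets layer replays; no real server needed.  */
#include <stdio.h>
#include <string.h>

struct exchange { const char *expect; const char *reply; };

static const struct exchange quirky_ftp[] = {
  { "USER anonymous", "331 send pass\r\n" },
  { "PASS",           "230 logged in\r\n" },
  { "SYST",           "215 AmigaOS\r\n" },   /* unusual SYST reply */
  { "PWD",            "257 \"/\"\r\n" },     /* terse 257: a quirk */
};

static const char *
scripted_reply (const char *cmd)
{
  size_t i;
  for (i = 0; i < sizeof quirky_ftp / sizeof *quirky_ftp; i++)
    if (strncmp (cmd, quirky_ftp[i].expect,
                 strlen (quirky_ftp[i].expect)) == 0)
      return quirky_ftp[i].reply;
  return "500 unknown command\r\n";
}

int
main (void)
{
  printf ("%s", scripted_reply ("SYST"));  /* prints the quirky reply */
  return 0;
}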

> For example, file retrieval is not possible, I guess?

File retrieval should still be possible, by having the fake, emulated
server produce content we specify. Emulating interfaces to the
filesystem might also be advantageous: it would let us emulate systems
with varying support for large files, without requiring that
test-runners have space for them, or simulate rare I/O error
conditions.
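
The same function-pointer trick would work on the file I/O side. For
instance, a test could simulate a full disk without actually filling
one (names invented, as before):

/* Hypothetical sketch: a swappable write lets a test simulate a full
   disk without actually filling one.  */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

static ssize_t (*wget_write) (int, const void *, size_t) = write;

static ssize_t
fake_write (int fd, const void *buf, size_t n)
{
  (void) fd; (void) buf; (void) n;
  errno = ENOSPC;               /* "No space left on device" */
  return -1;
}

int
main (void)
{
  wget_write = fake_write;      /* the test flips the switch */
  if (wget_write (1, "x", 1) == -1)
    perror ("write");
  return 0;
}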

> What do you mean by a protocol-level test?

I mean checking how Wget handles certain protocol-level situations:
buggy headers, missing headers, disconnects at specific points in the
conversation, and so on.
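
Concretely, such a test might pair a canned, deliberately broken
response with an assertion about how Wget handles it. A sketch (the
response and the expectation are both invented):

/* Hypothetical sketch: a canned response with a buggy header that a
   protocol-level test would feed to Wget through the fake layer.  */
#include <stdio.h>

static const char buggy_response[] =
  "HTTP/1.1 200 OK\r\n"
  "Content-Length: oops\r\n"    /* non-numeric length: buggy header */
  "\r\n"
  "hello";

int
main (void)
{
  /* A real test would then assert on Wget's behavior, e.g. that it
     falls back to read-until-EOF rather than erroring out.  */
  fputs (buggy_response, stdout);
  return 0;
}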


If, in order to support emulation of the networking API, we abstract
it out so we can swap real sockets for our "testing" layer, we also
gain the ability to swap in other implementations later: for instance,
true SOCKS support. This sort of abstraction is something I wanted
eventually, anyway.
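
Concretely, I'm picturing something like a small vtable of transport
operations, which real sockets, the testing layer, and eventually a
SOCKS client could each implement. All names below are invented:

/* Hypothetical sketch: a transport vtable.  Real sockets, the testing
   layer, and a SOCKS client would each supply their own functions.  */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

struct transport
{
  void *data;                   /* implementation-private state */
  int     (*open)  (void *data, const char *host, int port);
  ssize_t (*read)  (void *data, char *buf, size_t n);
  void    (*close) (void *data);
};

/* A fake implementation that "serves" a fixed string for any host.  */
struct fake_state { const char *body; size_t off; };

static int
fake_open (void *data, const char *host, int port)
{
  (void) data;
  printf ("pretending to connect to %s:%d\n", host, port);
  return 0;
}

static ssize_t
fake_read (void *data, char *buf, size_t n)
{
  struct fake_state *st = data;
  size_t left = strlen (st->body) - st->off;
  if (n > left)
    n = left;
  memcpy (buf, st->body + st->off, n);
  st->off += n;
  return n;
}

static void
fake_close (void *data)
{
  (void) data;
}

int
main (void)
{
  struct fake_state st = { "HTTP/1.1 204 No Content\r\n\r\n", 0 };
  struct transport t = { &st, fake_open, fake_read, fake_close };
  char buf[128];
  ssize_t n;

  t.open (t.data, "one.example.test", 80);  /* any hostname we like */
  while ((n = t.read (t.data, buf, sizeof buf)) > 0)
    fwrite (buf, 1, n, stdout);
  t.close (t.data);
  return 0;
}

Wget's protocol code would talk only to the struct; the tests would
hand it a scripted implementation, and a SOCKS version would just wrap
the real-sockets one.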

Performing the same abstraction for the filesystem would also provide
a natural hook for retrieve-to-tarball (or ".mht") functionality.

In fact, the abstraction itself might make a worthy GSoC proposal -
though it's probably not enough to fill a summer. Perhaps using the
abstraction to then emulate an HTTP server for a few tests would be a
good way to fill out the rest.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/
