[ I've Cc'd the list to invite comment on the idea of overriding
Berkeley Sockets and I/O functions for testing. ]
Yoshihiro Tanaka wrote:
>> Actually, the .px test files already spawn HTTP and/or FTP servers to
>> do their testing.
>>
>> Part of the problem with doing testing with a real local server,
>> though, is that there are a lot of cases I want to handle that involve
>> things like multiple hostnames/servers. With spawning _real_ servers,
>> the only name we can really count on is "localhost".
>>
>> Ideally, I'd like to completely encapsulate the Berkeley Sockets API,
>> so we can simply swap in a fake one for testing purposes. Then we
>> could do all the protocol-level tests without having to involve real
>> servers and their shortcomings.
>
> Sorry, I'm not sure about it: if we use a fake one, how do you test
> without connecting to an HTTP/FTP server?

Here's the problem with using real servers for core testing. (I do think we should still run functionality tests against real servers, to ensure that Wget is basically functional; I just don't think we should rely heavily on them.)

With a real remote server, I can't assume that a test failure means the test actually failed: it might have been a network failure.

With a real locally-spawned server, I can't test arbitrary hostnames. So I can't test issues related to --span-hosts, for instance, or ensure that Wget sends authorization tokens only to the correct server and no others, or that Wget never puts the "user:password@" part of URLs in the Referer header. This can be done with a remote server, but it requires that I set up multiple servers (or at least multiple names for a given server); and, as already mentioned, I don't like relying on the availability of remote servers for Wget's tests.
We _could_ spawn a local "proxy" server and have it pretend to Wget that it's forwarding requests to different servers, when it's actually only pretending; however, ideally we should be explicitly testing these things both with and without a proxy: some behaviors may change depending on that (such as the recently reported bug against Wget's HTTP authentication).

With real servers, it's hard to test how Wget handles rare error codes from some of the Berkeley Sockets or I/O functions. Overriding socket I/O operations also lets us write tests against cases where a specific number of bytes is available for reading, which actually corresponds to a bug that was fixed not too long ago. It also lets us dictate exact sequences of server behaviors across multiple hosts (I can't think of any uses for that in current Wget, though there probably are some; but there will be plenty once Wget supports multiple simultaneous connections).

Emulating the networking layer also means we can write tests for how Wget handles buggy or quirky servers (especially useful for FTP), without actually having to have those real servers set up somewhere for testing (which couldn't be done with locally-spawned servers, either).

> For example, file retrieval is not possible, I guess?

File retrieval should still be possible, by having the fake, emulated server produce content we specify.

Emulating interfaces to the filesystem might also be advantageous here, so we can emulate systems with varying support for large files, without requiring that test-runners actually have the space for them; or so we can simulate rare I/O error conditions.

> What do you mean by protocol-level test?

I mean checking how Wget handles certain protocol-level situations: buggy headers, missing headers, disconnects at specific points in the conversation, etc.
If, in order to support emulation of the networking API, we abstract it out so we can swap real sockets for our "testing" layer, we also gain the advantage that we could later swap in other things as well: for instance, true SOCKS support. This sort of abstraction is something I wanted eventually anyway. Performing the same abstraction for filesystem emulation would also provide a hook for retrieve-to-tarball (or ".mht") functionality.

In fact, the abstraction itself might make a worthy GSoC proposal, though it's probably not enough to fill a summer. Perhaps using the abstraction to then emulate an HTTP server for a few tests would be a good way to fill out the rest.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/