Re: About Automated Unit Test for Wget
Micah Cowan wrote:
> For my part, I see something which, at least for first cut, I could whip
> up in a couple of hours (the server emulation and associated
> state-tracking, of course, would be _quite_ a bit more work). What is it
> that causes our two perspectives to differ so wildly?

Perhaps it's that we're both missing the fact that we already have exactly what I'm talking about: connect.c. That's exactly what I was proposing, but somehow I missed that we were already using an interface between ourselves and the Berkeley Sockets API.

It would probably need slightly more abstraction to be suitable for SOCKS (in particular, it'd need to be object-based, so some connections use plain TCP connections while others might use SOCKS); but in the meantime, the emulated server I mentioned could be swapped in for tests as simply as linking against server-emu.o instead of connect.o.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer.
http://micah.cowan.name/
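A minimal sketch of the link-time swap described above, under stated assumptions. The names `net_connect`, `net_write`, `net_read` and `fetch_status` are hypothetical and are not the actual connect.c interface; in a real build the emulated functions would live in their own object file (the server-emu.o mentioned above) and be linked in place of connect.o. Everything is kept in one file here only so it can be compiled on its own.

```c
#include <stdio.h>
#include <string.h>

/* net.h (hypothetical): the only networking interface the HTTP code sees. */
int net_connect(const char *host, int port);
int net_write(int handle, const char *buf, size_t len);
int net_read(int handle, char *buf, size_t len);

/* server-emu.c (hypothetical): linked in place of connect.o for tests. */
static const char *canned_reply = "HTTP/1.0 200 OK\r\n\r\nhello";
static size_t reply_pos;

int net_connect(const char *host, int port)
{
    (void)host; (void)port;
    reply_pos = 0;          /* every "connection" replays the canned reply */
    return 1000;            /* fake handle, never a real descriptor */
}

int net_write(int handle, const char *buf, size_t len)
{
    (void)handle; (void)buf;
    return (int)len;        /* pretend the whole request was sent */
}

int net_read(int handle, char *buf, size_t len)
{
    size_t left = strlen(canned_reply) - reply_pos;
    size_t n = len < left ? len : left;
    (void)handle;
    memcpy(buf, canned_reply + reply_pos, n);
    reply_pos += n;
    return (int)n;          /* 0 signals end of stream */
}

/* Caller code: identical whichever object file is linked in. */
int fetch_status(const char *host)
{
    const char *request = "GET / HTTP/1.0\r\n\r\n";
    char buf[64];
    int code = -1;
    int handle = net_connect(host, 80);
    int n;

    net_write(handle, request, strlen(request));
    n = net_read(handle, buf, sizeof buf - 1);
    buf[n] = '\0';
    sscanf(buf, "HTTP/%*d.%*d %d", &code);   /* pick the status code out of the status line */
    return code;
}
```

The point of the design is that `fetch_status` contains no test-specific code: only the linker decides whether it talks to a real host or to the canned reply.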
Re: About Automated Unit Test for Wget
Micah Cowan wrote:
> Yeah. But we're not doing streaming. And you still haven't given much
> explanation for _why_ it's as hard and time-consuming as you say. Making
> a claim and demonstrating it are different things, I think.

To be clear, I'm not trying to say, "I don't believe you"; I'm saying, "argue the case, please, don't just make assertions." Clearly, you're concerned about something I'm unable to see: help me to see it! If I ignore your warnings, and wind up running headlong into what you saw in the first place, you can't claim you gave fair warning if you didn't provide examples of what I might run into.

For my part, I see something which, at least for first cut, I could whip up in a couple of hours (the server emulation and associated state-tracking, of course, would be _quite_ a bit more work). What is it that causes our two perspectives to differ so wildly?

--
Micah J. Cowan
Re: About Automated Unit Test for Wget
Hrvoje Niksic wrote:
> Micah Cowan <[EMAIL PROTECTED]> writes:
>
> But adding a whole new abstraction layer over something as general as
> Berkeley sockets to facilitate an automated test suite definitely
> sounds like ignoring the costs of such an abstraction layer.

Facilitate a test suite _and_ pave the way for more general connection handling, don't forget. Proper SOCKS (and other) support is a current goal; I don't see how we can handle it without an additional interface layer.

> I have no idea what "file descriptor silliness" with values 0-2 you're
> referring to. :-) I do agree that an application-specific struct is
> better than a more general abstraction because it is easier to design
> and more useful to Wget in the long run.

Hm, I was actually thinking the struct (or, more specifically, a pointer to the struct) was the more general abstraction. :) By "file descriptor silliness", I really meant the use of a single pool of possible ints, which is shared across a large number of fairly different applications. Not to say that file descriptors were a bad idea for Unix, but rather that emulating them would be sloppy.

> Because implementing a file I/O abstraction is much harder and more
> time-consuming than it sounds. To paraphrase Greenspun, it would
> appear that every sufficiently large code base contains an ad-hoc,
> informally-specified, bug-ridden implementation of a streaming layer.
> There are streaming libraries out there; maybe we should consider
> using some of them.

Yeah. But we're not doing streaming. And you still haven't given much explanation for _why_ it's as hard and time-consuming as you say. Making a claim and demonstrating it are different things, I think.

All we have to do for a testing-capable replacement interface is replace calls to socket, connect, (socket) read/write, shutdown/close with equivalents. We don't even have to do the other calls to start with, nor do we even need to use the new layer everywhere (just in http.c to start, I believe). Yes, there's slightly more to it than that, but not earth-shatteringly more.

--
Micah J. Cowan
Re: About Automated Unit Test for Wget
Yoshihiro Tanaka wrote:
> By the way, How about using LD_PRELOAD ?

Too specific to Linux/ld-linux.so, I think. Ideally, the test solution should work on all supported platforms for Wget (including Windows).

--
Micah J. Cowan
Re: About Automated Unit Test for Wget
Micah Cowan <[EMAIL PROTECTED]> writes:
> I don't see what you see wrt making the code harder to follow and reason
> about (true abstraction rarely does, AFAICT,

I was referring to the fact that adding an abstraction layer requires learning about the abstraction layer, both its concepts and its implementation, including its quirks and limitations. Overly general abstractions added to application software tend to be underspecified (for the domain they attempt to cover) and incomplete. Programmers tend to ignore the hidden cost of adding an abstraction layer until the cost becomes apparent, by which time it is too late.

Application-specific abstractions are usually worth it because they are well-justified: they directly benefit the application by making the code base simpler and removing duplication. Some general abstractions are worth it because the alternative is worse; you wouldn't want to have two versions of SSL-using code, one for regular sockets, and one for SSL, since the whole point of SSL is that you're supposed to use it "as if" it were sockets behind the scenes. But adding a whole new abstraction layer over something as general as Berkeley sockets to facilitate an automated test suite definitely sounds like ignoring the costs of such an abstraction layer.

> I _am_ thinking that it'd probably be best to forgo the idea of
> one-to-one correspondence of Berkeley sockets, and pass around a "struct
> net_connector *" (and "struct net_listener *"), so we're not forced to
> deal with file descriptor silliness (where obviously we'd have wanted to
> avoid the values 0 through 2, and I was even thinking it might
> _possibly_ be worthwhile to allocate real file descriptors to get the
> numbers, just to avoid clashes).

I have no idea what "file descriptor silliness" with values 0-2 you're referring to. :-) I do agree that an application-specific struct is better than a more general abstraction because it is easier to design and more useful to Wget in the long run.

>>> This would mean we'd need to separate uses of read() and write() on
>>> normal files (which should continue to use the real calls, until we
>>> replace them with the file I/O abstractions), from uses of read(),
>>> write(), etc on sockets, which would be using our emulated versions.
>>
>> Unless you're willing to spend a lot of time in careful design of
>> these abstractions, I think this is a mistake.
>
> Why?

Because implementing a file I/O abstraction is much harder and more time-consuming than it sounds. To paraphrase Greenspun, it would appear that every sufficiently large code base contains an ad-hoc, informally-specified, bug-ridden implementation of a streaming layer. There are streaming libraries out there; maybe we should consider using some of them.
Re: About Automated Unit Test for Wget
2008/4/5, Micah Cowan <[EMAIL PROTECTED]>:
> Daniel Stenberg wrote:
> > This system allows us to write unit-tests if we'd like to, but mostly so
> > far we've focused to test it system-wide. It is hard enough for us!
>
> Yeah, I thought I'd seen something like that; I was thinking we might
> even be able to appropriate some of that, if that looked doable. Except
> that I preferred faking the server completely, so I could deal better
> with cross-site issues, which AFAICT are significantly more important to
> Wget than they are to Curl.

It seems that the abstraction of the network API needs more discussion, so I would focus on the server emulation.

By the way, how about using LD_PRELOAD? I tested it a little and it seems to work. If we use this, we can test by overriding the socket interface, and still not change Wget's real source code.

I found this approach on the net, and the sample was using wget!! They are overriding socket, close, and connect.

-- main.c --
#include <stdio.h>

int main(void)
{
    puts("Helow Wgets\n");
    return 0;
}

-- testputs.c --
#include <stdio.h>

/* Replacement puts(): the preloaded library is searched before libc. */
int puts(const char *str)
{
    while (*str)
        putchar(*str++);
    printf("This is a test module");
    putchar('\n');
    return 0;   /* puts() returns a non-negative value on success */
}

-- Compile like below:
[EMAIL PROTECTED] Test]$ gcc main.c -o main
[EMAIL PROTECTED] Test]$ gcc -fPIC -shared -o testputs.so testputs.c

-- Execute like below:
[EMAIL PROTECTED] Test]$ ./main
Helow Wgets

[EMAIL PROTECTED] Test]$ LD_PRELOAD=./testputs.so ./main
Helow Wgets
This is a test module

--
Yoshihiro TANAKA
SFSU CS Department
Re: About Automated Unit Test for Wget
2008/4/5, Micah Cowan <[EMAIL PROTECTED]>:
> Daniel Stenberg wrote:
> > This system allows us to write unit-tests if we'd like to, but mostly so
> > far we've focused to test it system-wide. It is hard enough for us!
>
> Yeah, I thought I'd seen something like that; I was thinking we might
> even be able to appropriate some of that, if that looked doable. Except
> that I preferred faking the server completely, so I could deal better
> with cross-site issues, which AFAICT are significantly more important to
> Wget than they are to Curl.

It seems that the abstraction of the network API needs more discussion, so I would focus on the server emulation.

By the way, how about using LD_PRELOAD? I tested it a little and it seems to work. If we use this, we can test by overriding the socket interface, and still not change Wget's real source code.

-- main.c --
#include <stdio.h>

int main(void)
{
    puts("Helow Wgets\n");
    return 0;
}

-- testputs.c --
#include <stdio.h>

/* Replacement puts(): the preloaded library is searched before libc. */
int puts(const char *str)
{
    while (*str)
        putchar(*str++);
    printf("This is a test module");
    putchar('\n');
    return 0;   /* puts() returns a non-negative value on success */
}

-- Compile like below:
[EMAIL PROTECTED] Test]$ gcc main.c -o main
[EMAIL PROTECTED] Test]$ gcc -fPIC -shared -o testputs.so testputs.c

-- Execute like below:
[EMAIL PROTECTED] Test]$ ./main
Helow Wgets

[EMAIL PROTECTED] Test]$ LD_PRELOAD=./testputs.so ./main
Helow Wgets
This is a test module

I found this approach on the net, and the sample was using wget!! They are overriding socket, close, and connect.
http://www.t-dori.net/forensics/hook_tcp.cpp

--
Yoshihiro TANAKA
Re: About Automated Unit Test for Wget
Hrvoje Niksic wrote:
> Micah Cowan <[EMAIL PROTECTED]> writes:
>
>>> Or did you mean to write wget version of socket interface? i.e. to
>>> write our version of socket, connect,write,read,close,bind,
>>> listen,accept,,,? sorry I'm confused.
>> Yes! That's what I meant. (Except, we don't need listen, accept; and
>> we only need bind to support --bind-address. We're a client, not a
>> server. ;) )
>>
>> It would be enough to write function-pointers for (say), wg_socket,
>> wg_connect, wg_sock_write, wg_sock_read, etc, etc, and point them at
>> system socket, connect, etc for "real" Wget, but at wg_test_socket,
>> wg_test_connect, etc for our emulated servers.
>
> This seems like a neat idea, but it should be carefully weighed
> against the drawbacks. Adding an ad-hoc abstraction layer is harder
> than it sounds, and has more repercussions than is immediately
> obvious. An underspecified, unfinished abstraction layer over sockets
> makes the code harder, not easier, to follow and reason about. You no
> longer deal with BSD sockets, you deal with an abstraction over them.
> Is it okay to call getsockname on such a socket? How about
> setsockopt? What about the listen/bind mechanism (which we do need,
> as Daniel points out)?

I'm having some trouble seeing how most of those present problems. Obviously, you wouldn't call _any_ system functions on these, so yeah, no setsockopt() unless it's a wg_setsockopt() (a wg_setsockopt would probably be a poor way to handle it anyway, as it'd be mainly true-TCP specific).

I don't see what you see wrt making the code harder to follow and reason about (true abstraction rarely does, AFAICT, though there are some counter-examples, usually of things that are much, much more abstract than we are used to thinking about). Did you have some specific concerns?

I _am_ thinking that it'd probably be best to forgo the idea of one-to-one correspondence of Berkeley sockets, and pass around a "struct net_connector *" (and "struct net_listener *"), so we're not forced to deal with file descriptor silliness (where obviously we'd have wanted to avoid the values 0 through 2, and I was even thinking it might _possibly_ be worthwhile to allocate real file descriptors to get the numbers, just to avoid clashes). Then we can focus on actual abstraction (which we don't obtain by emulating Berkeley sockets), rather than just emulation.

While Daniel was of course right that we'd need listen, accept, etc, we _wouldn't_ need them to begin using this layer to test against http.c. We wouldn't even need bind, if we didn't include --bind-address in our first tests of the http code.

>> This would mean we'd need to separate uses of read() and write() on
>> normal files (which should continue to use the real calls, until we
>> replace them with the file I/O abstractions), from uses of read(),
>> write(), etc on sockets, which would be using our emulated versions.
>
> Unless you're willing to spend a lot of time in careful design of
> these abstractions, I think this is a mistake.

Why?

--
Micah J. Cowan
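To make the "struct net_connector *" idea above concrete, here is one possible shape for it, offered as a sketch only. Only the struct name comes from the message; the member layout, `struct fake_state` and `fake_connector_init` are assumptions made for illustration, and nothing here reflects code that exists in Wget.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical connection object; callers never see a file descriptor. */
struct net_connector {
    int  (*connect)(struct net_connector *self, const char *host, int port);
    long (*read)(struct net_connector *self, char *buf, size_t len);
    long (*write)(struct net_connector *self, const char *buf, size_t len);
    void (*close)(struct net_connector *self);
    void *state;            /* private to the implementation */
};

/* State for an in-memory connector that only tests would use. */
struct fake_state {
    const char *reply;      /* bytes the "server" will send */
    size_t pos;             /* how much of reply has been read so far */
    size_t written;         /* bytes the client has "sent" */
    int is_open;
};

static int fake_connect(struct net_connector *self, const char *host, int port)
{
    struct fake_state *st = self->state;
    (void)host; (void)port;
    st->pos = 0;
    st->written = 0;
    st->is_open = 1;
    return 0;
}

static long fake_read(struct net_connector *self, char *buf, size_t len)
{
    struct fake_state *st = self->state;
    size_t left = strlen(st->reply) - st->pos;
    if (len > left)
        len = left;
    memcpy(buf, st->reply + st->pos, len);
    st->pos += len;
    return (long)len;       /* 0 once the reply is exhausted */
}

static long fake_write(struct net_connector *self, const char *buf, size_t len)
{
    struct fake_state *st = self->state;
    (void)buf;
    st->written += len;     /* record the size only; a fuller fake would keep the bytes */
    return (long)len;
}

static void fake_close(struct net_connector *self)
{
    ((struct fake_state *)self->state)->is_open = 0;
}

/* Wire a connector up to the in-memory implementation. */
void fake_connector_init(struct net_connector *c, struct fake_state *st,
                         const char *reply)
{
    st->reply = reply;
    st->pos = 0;
    st->written = 0;
    st->is_open = 0;
    c->connect = fake_connect;
    c->read = fake_read;
    c->write = fake_write;
    c->close = fake_close;
    c->state = st;
}
```

A plain-TCP or SOCKS implementation would fill in the same four pointers, so calling code such as `conn->read(conn, buf, len)` stays the same regardless of the transport behind it.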
Re: About Automated Unit Test for Wget
Daniel Stenberg wrote:
> In the curl project we took a simpler route: we have our own dumb test
> servers in the test suite to run tests against and we have single files
> that describe each test case: what the server should respond, what the
> protocol dump should look like, what output to expect, what return code,
> etc. Then we have a script that reads the test case description, fires
> up the correct server(s), verifies all the outputs (optionally using
> valgrind).
>
> This system allows us to write unit-tests if we'd like to, but mostly so
> far we've focused to test it system-wide. It is hard enough for us!

Yeah, I thought I'd seen something like that; I was thinking we might even be able to appropriate some of that, if that looked doable. Except that I preferred faking the server completely, so I could deal better with cross-site issues, which AFAICT are significantly more important to Wget than they are to Curl.

I was thinking, and should have said, that if we go this route, we'd want to focus on high-level tests first. That also has the advantage that if we accidentally change something during the refactoring process (not unlikely), we will notice it, whereas focusing just on unit tests would mean we'd have to change the code to be testable in units _before_ verification.

We already _do_ have some spawn-a-server test code, but much of it needs rewriting, and it still suffers when you bring in the idea of multiple servers. The servers are driven by Perl code, rather than a driver script or description file.

--
Micah J. Cowan
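One way to picture a completely faked server that spans several sites is a small table keyed by host and path. The sketch below is purely illustrative: `struct fake_resource`, `fake_site` and `fake_respond` are hypothetical names, and the table is hard-coded where a real harness would presumably load it from a description file.

```c
#include <stdio.h>
#include <string.h>

/* One entry of the virtual server's state: which URL exists and what it returns. */
struct fake_resource {
    const char *host;
    const char *path;
    int status;
    const char *reason;
    const char *body;
};

/* Two hosts in one table, so cross-site links can be followed without a network. */
static const struct fake_resource fake_site[] = {
    { "a.example", "/index.html", 200, "OK", "<a href=\"http://b.example/x\">x</a>" },
    { "b.example", "/x",          200, "OK", "data" },
};

/* Write the canned response for host+path into out; return the status code. */
int fake_respond(const char *host, const char *path, char *out, size_t cap)
{
    size_t i;

    for (i = 0; i < sizeof fake_site / sizeof fake_site[0]; i++) {
        const struct fake_resource *r = &fake_site[i];
        if (strcmp(r->host, host) == 0 && strcmp(r->path, path) == 0) {
            snprintf(out, cap, "HTTP/1.0 %d %s\r\nContent-Length: %lu\r\n\r\n%s",
                     r->status, r->reason,
                     (unsigned long)strlen(r->body), r->body);
            return r->status;
        }
    }
    /* Anything not listed in the table does not exist on the virtual server. */
    snprintf(out, cap, "HTTP/1.0 404 Not Found\r\nContent-Length: 0\r\n\r\n");
    return 404;
}
```

Because both hosts live in the same table, a test can check how a link from a.example to b.example is handled without starting any server process.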
Re: About Automated Unit Test for Wget
Daniel Stenberg wrote:
> On Sat, 5 Apr 2008, Micah Cowan wrote:
>
>>> Or did you mean to write wget version of socket interface? i.e. to
>>> write our version of socket, connect,write,read,close,bind,
>>> listen,accept,,,? sorry I'm confused.
>>
>> Yes! That's what I meant. (Except, we don't need listen, accept; and
>> we only need bind to support --bind-address. We're a client, not a
>> server. ;) )
>
> Except, you do need listen, accept and bind in a server sense since even
> if wget is a client I believe it still supports the PORT command for ftp...

Damn FTP... :)

Yeah, of course. Sorry, my view of the web tends frequently to be very HTTP-colored. :) (Well, technically, that _is_ the WWW, but anyway...)

--
Micah J. Cowan
Re: About Automated Unit Test for Wget
On Sat, 5 Apr 2008, Hrvoje Niksic wrote:

>> This would mean we'd need to separate uses of read() and write() on
>> normal files (which should continue to use the real calls, until we
>> replace them with the file I/O abstractions), from uses of read(),
>> write(), etc on sockets, which would be using our emulated versions.
>
> Unless you're willing to spend a lot of time in careful design of these
> abstractions, I think this is a mistake.

Related:

In the curl project we took a simpler route: we have our own dumb test servers in the test suite to run tests against and we have single files that describe each test case: what the server should respond, what the protocol dump should look like, what output to expect, what return code, etc. Then we have a script that reads the test case description, fires up the correct server(s), verifies all the outputs (optionally using valgrind).

This system allows us to write unit-tests if we'd like to, but mostly so far we've focused to test it system-wide. It is hard enough for us!
Re: About Automated Unit Test for Wget
Micah Cowan <[EMAIL PROTECTED]> writes:

>> Or did you mean to write wget version of socket interface? i.e. to
>> write our version of socket, connect,write,read,close,bind,
>> listen,accept,,,? sorry I'm confused.
>
> Yes! That's what I meant. (Except, we don't need listen, accept; and
> we only need bind to support --bind-address. We're a client, not a
> server. ;) )
>
> It would be enough to write function-pointers for (say), wg_socket,
> wg_connect, wg_sock_write, wg_sock_read, etc, etc, and point them at
> system socket, connect, etc for "real" Wget, but at wg_test_socket,
> wg_test_connect, etc for our emulated servers.

This seems like a neat idea, but it should be carefully weighed against the drawbacks. Adding an ad-hoc abstraction layer is harder than it sounds, and has more repercussions than is immediately obvious. An underspecified, unfinished abstraction layer over sockets makes the code harder, not easier, to follow and reason about. You no longer deal with BSD sockets, you deal with an abstraction over them. Is it okay to call getsockname on such a socket? How about setsockopt? What about the listen/bind mechanism (which we do need, as Daniel points out)?

> This would mean we'd need to separate uses of read() and write() on
> normal files (which should continue to use the real calls, until we
> replace them with the file I/O abstractions), from uses of read(),
> write(), etc on sockets, which would be using our emulated versions.

Unless you're willing to spend a lot of time in careful design of these abstractions, I think this is a mistake.
Re: About Automated Unit Test for Wget
On Sat, 5 Apr 2008, Micah Cowan wrote:

>> Or did you mean to write wget version of socket interface? i.e. to
>> write our version of socket, connect,write,read,close,bind,
>> listen,accept,,,? sorry I'm confused.
>
> Yes! That's what I meant. (Except, we don't need listen, accept; and
> we only need bind to support --bind-address. We're a client, not a
> server. ;) )

Except, you do need listen, accept and bind in a server sense since even if wget is a client I believe it still supports the PORT command for ftp...
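For context on why PORT drags in the server-side calls: in active-mode FTP the client binds and listens on a local port, then tells the server where to connect back by sending its address as six decimal bytes (four for the IPv4 address, two for the port, high byte first). The helper below is a small illustration of that encoding only; `format_port_command` is a hypothetical name, not a function from Wget.

```c
#include <stdio.h>

/* Build "PORT h1,h2,h3,h4,p1,p2\r\n", where the port equals p1 * 256 + p2. */
int format_port_command(const unsigned char ip[4], unsigned port,
                        char *out, size_t cap)
{
    return snprintf(out, cap, "PORT %u,%u,%u,%u,%u,%u\r\n",
                    (unsigned)ip[0], (unsigned)ip[1],
                    (unsigned)ip[2], (unsigned)ip[3],
                    (port >> 8) & 0xffu,    /* high byte of the listening port */
                    port & 0xffu);          /* low byte of the listening port */
}
```

An emulated network layer therefore needs some counterpart to listen and accept before the FTP code can be exercised against it, even though the HTTP code can get by without one.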
Re: About Automated Unit Test for Wget
Yoshihiro Tanaka wrote:
> 2008/4/5, Yoshihiro Tanaka <[EMAIL PROTECTED]>:
>> Yes, since I want to write proposal for Unit testing, I can't skip this
>> problem. But considering GSoC program is only 2 months, I'd rather narrow
>> down the target - to gethttp function.

I have a sneaking suspicion that some chunks of functionality that you'd want to farm out in gethttp, also have code-change repercussions elsewhere (probably http_loop usually). So it may be difficult to "restrict" yourself to gethttp. :)

Probably better to identify the specific chunks of logic that can be farmed out, find out how far-reaching separating those chunks might be, and choose some specific ones to do.

You've already identified some areas; I'll comment on those when I have a chance to look more closely at the code, for comparison with your remarks.

>> In addition to above, we have to think about abstraction of
>> network API and file I/O API.
>>
>> But network API(such as fd_read_body, fd_read_hunk) exists in
>> retr.c, and socket is opened in connect.c file, it looks that
>> abstraction of network API would require major modification of
>> interfaces.
>
> Or did you mean to write wget version of socket interface?
> i.e. to write our version of socket, connect,write,read,close,bind,
> listen,accept,,,? sorry I'm confused.

Yes! That's what I meant. (Except, we don't need listen, accept; and we only need bind to support --bind-address. We're a client, not a server. ;) )

It would be enough to write function-pointers for (say), wg_socket, wg_connect, wg_sock_write, wg_sock_read, etc, etc, and point them at system socket, connect, etc for "real" Wget, but at wg_test_socket, wg_test_connect, etc for our emulated servers.

This would mean we'd need to separate uses of read() and write() on normal files (which should continue to use the real calls, until we replace them with the file I/O abstractions), from uses of read(), write(), etc on sockets, which would be using our emulated versions. Ideally, we'd replace the use of file descriptor ints with a more opaque mechanism; but that can be done later.

If you'd prefer, you might choose to write a proposal focusing on the server emulation, which would easily take up a summer of itself (and then some); particularly when you realize that we would need a file format describing the virtual server's state (what domains and URLs exist, what sort of headers it should respond with to certain requests, etc). If you chose to take that on, you'd probably need to settle for a subset of the final expected product.

Note that, down the road, we'll want to encapsulate the whole sockets-layer abstraction into an object we'd pass around as an argument (struct net_connector * ?), as we might want to use it to handle SOCKS for some URLs, while using direct connections for others. But that doesn't have to happen right now; once we've got the actual abstraction done it should be pretty easy to move it to an object-based mechanism (just use conn->connect(...) instead of wg_connect(...)). But, if you want to go ahead and do that now, that'd be great too.

--
Micah J. Cowan
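The function-pointer arrangement described above might look roughly like the following. This is a sketch under assumptions: the pointer names `wg_sock_read` and `wg_sock_write` come from the message, but their signatures, the `real_*` and `test_*` helpers and `wg_use_test_sockets` are invented here for illustration, and the "real" side uses POSIX read()/write(), so a Windows build would need different wrappers.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* "Real" implementations: thin wrappers over the system calls. */
static ssize_t real_sock_read(int fd, void *buf, size_t len)
{
    return read(fd, buf, len);
}

static ssize_t real_sock_write(int fd, const void *buf, size_t len)
{
    return write(fd, buf, len);
}

/* Test implementations: replay a canned reply and count what was sent. */
static const char *test_reply = "";
static size_t test_reply_pos;
static size_t test_bytes_sent;

static ssize_t test_sock_read(int fd, void *buf, size_t len)
{
    size_t left = strlen(test_reply) - test_reply_pos;
    (void)fd;
    if (len > left)
        len = left;
    memcpy(buf, test_reply + test_reply_pos, len);
    test_reply_pos += len;
    return (ssize_t)len;
}

static ssize_t test_sock_write(int fd, const void *buf, size_t len)
{
    (void)fd; (void)buf;
    test_bytes_sent += len;
    return (ssize_t)len;
}

/* The pointers the rest of the program calls through; real by default. */
ssize_t (*wg_sock_read)(int, void *, size_t) = real_sock_read;
ssize_t (*wg_sock_write)(int, const void *, size_t) = real_sock_write;

/* Point the layer at the emulated server and reset its state. */
void wg_use_test_sockets(const char *reply)
{
    test_reply = reply;
    test_reply_pos = 0;
    test_bytes_sent = 0;
    wg_sock_read = test_sock_read;
    wg_sock_write = test_sock_write;
}

/* Number of bytes the code under test has "sent" since the last reset. */
size_t wg_test_bytes_sent(void)
{
    return test_bytes_sent;
}
```

Calls on ordinary files would keep using read() and write() directly; only socket I/O goes through the `wg_` pointers, which is the separation the message asks for.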
Re: About Automated Unit Test for Wget
2008/4/5, Yoshihiro Tanaka <[EMAIL PROTECTED]>:
> 2008/4/4, Micah Cowan <[EMAIL PROTECTED]>:
> > IMO, if it's worth testing, it's probably better to have external
> > linkage anyway.
>
> I got it.
>
> > > 1) Select functions which can be tested in unit test.
> > > But How can I select them? is difficult.
> > > Basically the less dependency the function has, more easy to
> > > include in unit test, but about boundary line, I'm not sure.
> >
> > This is precisely the problem, and one reason I've been thinking that
> > this might not make an ideal topic for a GSoC proposal, unless you want
> > to include refactoring existing functions like gethttp and http_loop
> > into more logically discrete sets of functions. Essentially, to get
> > better coverage of the code that needs it the most, that code will need
> > to be rewritten. I believe this can be an iterative process (find one
> > function to factor out, write a unit test for it, make it work...).
>
> Yes, since I want to write proposal for Unit testing, I can't skip this
> problem. But considering GSoC program is only 2 months, I'd rather narrow
> down the target - to gethttp function.
>
> Although I'm not well aware of source code,
> I'm thinking like below:
>
> In gethttp function there are roughly six chunks of functionality.
>
> 1. preparation of request
>
> 2. making header part of HTTP
>    proxy_auth
>    generate_hosthead
>    , and other process to make header
>
> 3. connection
>    persistent_available_p
>    establishment of connection to host
>    ssl_connection process
>
> 4. http request process
>    request send
>    read request response
>    checking status codes
>
> 5. local file - related process ( a bunch of process...)
>    determine filename
>    file existence check
>    noclobber, -O check
>    timestamping check
>    Content-Length check
>    Keep-Alive response check
>    Authorize process
>    Set-cookie header
>    Content-Range check
>    filename dealing (HTML Extension)
>    416 status code dealing
>    open local file
>
> 6. download body part & writing into local file
>
> So, Basically I think it could be divided into these functionality.
> And after that each functionality would be divided into more
> small pieces to the extent that unit tests can be done separately.
>
> In addition to above, we have to think about abstraction of
> network API and file I/O API.
>
> But network API(such as fd_read_body, fd_read_hunk) exists in
> retr.c, and socket is opened in connect.c file, it looks that
> abstraction of network API would require major modification of
> interfaces.

Or did you mean to write wget version of socket interface? i.e. to write our version of socket, connect,write,read,close,bind, listen,accept,,,? sorry I'm confused.

> And design of this would not be proper for me, I think.
> So what I want to suggest is that I want to ask interface _design_.
> How do you think ? At least I want to narrow down the scope within
> I can take responsibility.
>
> > What I'm most keenly interested in, is the ability to verify the logic
> > of how follow/don't-follow is decided (that actually may not be too hard
> > to write tests against now), how Wget handles various protocol-level
> > situations, how it chooses the filename and deals with the local
> > filesystem, etc. I will be very, _very_ happy when everything that's in
> > http_loop and gethttp is verified by unit tests.
> >
> > But a lot of getting to where we can test that may mean abstracting out
> > things like the Berkeley Sockets API and filesystem interactions, so
> > that we can drop in "fake" replacements for testing.
>
> I'd like to try, if we could settle down the problem of interface design...
>
> > I'm familiar with a framework called (simply) "Check", which might be
> > worth considering. It forks a new process for each test, which isolates
> > it from interfering with the other tests, and also provides a clean way
> > to handle things like segmentation violations or aborts. However, it's
> > intended for Unix, and probably doesn't compile on other systems.
> >
> > http://check.sourceforge.net/
>
> Thank you for your information :)
>
> --
> Yoshihiro TANAKA

--
Yoshihiro TANAKA
Re: About Automated Unit Test for Wget
2008/4/4, Micah Cowan <[EMAIL PROTECTED]>:
> IMO, if it's worth testing, it's probably better to have external
> linkage anyway.

I got it.

> > 1) Select functions which can be tested in unit test.
> > But How can I select them? is difficult.
> > Basically the less dependency the function has, more easy to
> > include in unit test, but about boundary line, I'm not sure.
>
> This is precisely the problem, and one reason I've been thinking that
> this might not make an ideal topic for a GSoC proposal, unless you want
> to include refactoring existing functions like gethttp and http_loop
> into more logically discrete sets of functions. Essentially, to get
> better coverage of the code that needs it the most, that code will need
> to be rewritten. I believe this can be an iterative process (find one
> function to factor out, write a unit test for it, make it work...).

Yes, since I want to write a proposal for unit testing, I can't skip this problem. But considering the GSoC program is only 2 months, I'd rather narrow down the target to the gethttp function.

Although I'm not well aware of the source code, I'm thinking like below:

In the gethttp function there are roughly six chunks of functionality.

1. preparation of request

2. making header part of HTTP
   proxy_auth
   generate_hosthead
   , and other process to make header

3. connection
   persistent_available_p
   establishment of connection to host
   ssl_connection process

4. http request process
   request send
   read request response
   checking status codes

5. local file - related process ( a bunch of process...)
   determine filename
   file existence check
   noclobber, -O check
   timestamping check
   Content-Length check
   Keep-Alive response check
   Authorize process
   Set-cookie header
   Content-Range check
   filename dealing (HTML Extension)
   416 status code dealing
   open local file

6. download body part & writing into local file

So, basically I think it could be divided along these lines. After that, each chunk would be divided into smaller pieces to the extent that unit tests can be done separately.

In addition to the above, we have to think about abstraction of the network API and the file I/O API.

But since the network API (such as fd_read_body, fd_read_hunk) lives in retr.c, and the socket is opened in connect.c, it looks like abstracting the network API would require major modification of interfaces.

And designing this would not be a proper task for me, I think. So what I want to suggest is that I ask for the interface _design_. What do you think? At least I want to narrow down the scope to what I can take responsibility for.

> What I'm most keenly interested in, is the ability to verify the logic
> of how follow/don't-follow is decided (that actually may not be too hard
> to write tests against now), how Wget handles various protocol-level
> situations, how it chooses the filename and deals with the local
> filesystem, etc. I will be very, _very_ happy when everything that's in
> http_loop and gethttp is verified by unit tests.
>
> But a lot of getting to where we can test that may mean abstracting out
> things like the Berkeley Sockets API and filesystem interactions, so
> that we can drop in "fake" replacements for testing.

I'd like to try, if we could settle the problem of interface design...

> I'm familiar with a framework called (simply) "Check", which might be
> worth considering. It forks a new process for each test, which isolates
> it from interfering with the other tests, and also provides a clean way
> to handle things like segmentation violations or aborts. However, it's
> intended for Unix, and probably doesn't compile on other systems.
>
> http://check.sourceforge.net/

Thank you for your information :)

--
Yoshihiro TANAKA
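As an illustration of the "factor one piece out, then unit-test it" loop discussed above, the "checking status codes" step from chunk 4 could be isolated as a pure function with no network or file dependencies. The function below is hypothetical and written from scratch for this sketch; it is not Wget's own status-line parsing code.

```c
#include <ctype.h>
#include <string.h>

/* Return the three-digit status code from an HTTP status line, or -1 if malformed. */
int parse_status_code(const char *line)
{
    const char *p;

    if (strncmp(line, "HTTP/", 5) != 0)
        return -1;
    p = strchr(line, ' ');              /* the code follows the version token */
    if (p == NULL)
        return -1;
    while (*p == ' ')
        p++;
    /* Short-circuit evaluation stops at the first non-digit, so we never read
       past the terminating NUL. */
    if (!isdigit((unsigned char)p[0]) ||
        !isdigit((unsigned char)p[1]) ||
        !isdigit((unsigned char)p[2]))
        return -1;
    /* The code must end at a space, a CR, or the end of the string. */
    if (p[3] != ' ' && p[3] != '\r' && p[3] != '\0')
        return -1;
    return (p[0] - '0') * 100 + (p[1] - '0') * 10 + (p[2] - '0');
}
```

A function of this shape needs no server, emulated or otherwise, to test: a unit test simply feeds it strings, including the 416 case listed under chunk 5, and checks the returned integers.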
Re: wget 1.11.1 make test fails
On Thursday, April 3, 2008 at 9:14:52 -0700, Micah Cowan wrote: > Are you certain you rebuilt cmpt.o? This seems pretty unlikely, to me. Certain: make test after touching src/sysdep.h rebuilds both cmpt.o, the normal in src/ and the one in tests/. And both those cmpt.o become 784 bytes bigger without SYSTEM_FNMATCH. Alain.
Re: wget 1.11.1 make test fails
On Thursday, April 3, 2008 at 22:37:41 +0200, Hrvoje Niksic wrote: > Or it could be that you're picking up a different fnmatch.h that sets > up a different value for FNM_PATHNAME. Do you have more than one > "fnmatch.h" installed on your system? I have only /usr/include/fnmatch.h installed, identical to the file in the libc-5.4.33 tarball, and defining the same values as wget's src/sysdep.h (even comments are identical). Just "my" fnmatch.h defines two more flags, FNM_LEADING_DIR=8 and FNM_CASEFOLD=16, and defines an FNM_FILE_NAME alias (commented as "Preferred GNU name") to FNM_PATHNAME=1 (the libc code uses only this alias). Anyway I had noticed your comment about incompatible headers, and double-checked your little test program also with explicit value 1: same results. BTW everybody should be able to reproduce the make test failure, on any system, just by #undefining SYSTEM_FNMATCH in src/sysdep.h Alain.
Re: wget 1.11.1 make test fails
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hrvoje Niksic wrote: > Alain Guibert <[EMAIL PROTECTED]> writes: > >>> Maybe you could put a breakpoint in fnmatch and see what goes wrong? >> The for loop intended to eat several characters from the string also >> advances the pattern pointer. This one reaches the end of the pattern, >> and points to a NUL. It is not a '*' anymore, so the loop exits >> prematurely. Just below, a test for NUL returns 0. > > Thanks for the analysis. Looking at the current fnmatch code in > gnulib, it seems that the fix is to change that NUL test to something > like: > > if (c == '\0') > { > /* The wildcard(s) is/are the last element of the pattern. > If the name is a file name and contains another slash > this means it cannot match. */ > int result = (flags & FNM_PATHNAME) == 0 ? 0 : FNM_NOMATCH; > if (flags & FNM_PATHNAME) > { > if (!strchr (n, '/')) > result = 0; > } > return result; > } > > But I'm not at all sure that it covers all the needed cases. I'm thinking not: the loop still shouldn't be incrementing n, since that forces each additional * to match at least one character, doesn't it? Gnulib's version seems to handle that better. > Maybe we > should simply switch to gnulib-provided fnmatch? Unfortunately that > one is quite complex and quite hard for the '**' extension Micah > envisions. There might be other fnmatch implementations out there in > GNU which are debugged but still simpler than the gnulib/glibc one. Maybe. I'm not sure ** would be too hard to add to gnulib's fnmatch, just have to toggle with the FNM_FILE_NAME tests within the '*' case, if we see an immediate second '*'. But maybe ** as part of a *?**? sequence is more complex. I don't think so, though. The main thing is that we need it to support the invalid sequence stuff. Hm; I'm not sure we'll ever want fnmatch() to be locale-aware, though. 
User-specified match patterns should interpret characters based on the locale; but the source strings may be in different encodings altogether. If we solve this by transcoding to the current locale, we may find that the user's locale doesn't support all of the characters that the original string's encoding does. Probably we'll need to transcode both to Unicode before comparison. In the meantime, though, I think we want a simple byte-by-byte match. Perhaps it's best to (a) use our custom matcher, ignoring the system's (so we don't get locale specialness), and (b) fix it, providing as thorough test coverage as possible. - -- Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer. http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFH9jWi7M8hyUobTrERAglwAKCDnpnDjr44Ovgh/oBuzkM4mu/gKACeNnN8 arvFSrCEBatNeO29fzHxuU4= =QDMp -END PGP SIGNATURE-
Re: wget 1.11.1 make test fails
Alain Guibert <[EMAIL PROTECTED]> writes:

>> Maybe you could put a breakpoint in fnmatch and see what goes wrong?
>
> The for loop intended to eat several characters from the string also
> advances the pattern pointer. This one reaches the end of the pattern,
> and points to a NUL. It is not a '*' anymore, so the loop exits
> prematurely. Just below, a test for NUL returns 0.

Thanks for the analysis. Looking at the current fnmatch code in
gnulib, it seems that the fix is to change that NUL test to something
like:

    if (c == '\0')
      {
        /* The wildcard(s) is/are the last element of the pattern.
           If the name is a file name and contains another slash
           this means it cannot match. */
        int result = (flags & FNM_PATHNAME) == 0 ? 0 : FNM_NOMATCH;
        if (flags & FNM_PATHNAME)
          {
            if (!strchr (n, '/'))
              result = 0;
          }
        return result;
      }

But I'm not at all sure that it covers all the needed cases. Maybe we
should simply switch to gnulib-provided fnmatch? Unfortunately that
one is quite complex and quite hard for the '**' extension Micah
envisions. There might be other fnmatch implementations out there in
GNU which are debugged but still simpler than the gnulib/glibc one.

It's kind of ironic that while the various system fnmatches were
considered broken, the one Wget was using (for many years
unconditionally!) was also broken.
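
[Editorial sketch] The proposed NUL test can be isolated into a small function so that its three outcomes are easy to check. The name trailing_wildcard_result is hypothetical (it is not part of Wget or gnulib); the logic is a condensed but equivalent form of the snippet quoted in this message, where n points at the part of the name that the trailing wildcard would have to swallow. It covers only that NUL test, not the loop that advances the pattern pointer.

```c
#include <fnmatch.h>
#include <string.h>

/* Hypothetical helper: result of a match once the wildcard(s) turn out
   to be the last element of the pattern.  With FNM_PATHNAME, a
   remaining '/' in the name means the wildcard cannot match.  */
int
trailing_wildcard_result (const char *n, int flags)
{
  if ((flags & FNM_PATHNAME) && strchr (n, '/'))
    return FNM_NOMATCH;
  return 0;
}
```

For the failing case in this thread, the pattern "foo*" against "foo/bar" leaves n pointing at "/bar", so with FNM_PATHNAME the helper reports FNM_NOMATCH, while without the flag it reports a match.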
Re: wget 1.11.1 make test fails
Alain Guibert <[EMAIL PROTECTED]> writes: > On Wednesday, April 2, 2008 at 23:09:52 +0200, Hrvoje Niksic wrote: > >> Micah Cowan <[EMAIL PROTECTED]> writes: >>> It's hard for me to imagine an fnmatch that ignores FNM_PATHNAME > > The libc 5.4.33 fnmatch() supports FNM_PATHNAME, and there is code > apparently intending to return FNM_NOMATCH on a slash. But this code > seems to be rather broken. Or it could be that you're picking up a different fnmatch.h that sets up a different value for FNM_PATHNAME. Do you have more than one "fnmatch.h" installed on your system?
Re: About Automated Unit Test for Wget
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Yoshihiro Tanaka wrote: > Hello, I want to ask about Unit test of Wget in the future. > I want to ask about unit test. > > Now unit test of Wget is written only for following .c files. > -- http.c init.c main.c res.c url.c utils.c (test.c) > > So as written in Wiki, new unit test suite is necessary. >(ref. http://wget.addictivecode.org/FeatureSpecifications/Testing) Well, or expansions to the existing one. However, it's my desire (as expressed on that page) that the test code be separated from the .c files containing the code-to-test. This may mean making some functions that are currently "static" to be externally linked. IMO, if it's worth testing, it's probably better to have external linkage anyway. > In order to make new unit test suite, I think following work is necessary. > > 1) Select functions which can be tested in unit test. > But How can I select them? is difficult. > Basically the less dependency the function has, more easy to > include in unit test, but about boundary line, I'm not sure. This is precisely the problem, and one reason I've been thinking that this might not make an ideal topic for a GSoC proposal, unless you want to include refactoring existing functions like gethttp and http_loop into more logically discreet sets of functions. Essentially, to get better coverage of the code that needs it the most, that code will need to be rewritten. I believe this can be an iterative process (find one function to factor out, write a unit test for it, make it work...). > 2) In order to do above 1), How about Making a list of all functions > in Wget? and maintain? > > The advantage of 2) is ... > * make clear which function is included in Unit test > * make clear which function is _not_ in Unit test > * make easy to manage testing > * make easy to devide testing work Hm... I'm not sure that the benefits are worth the effort. 
If we _really_ wanted this, I'd propose that we use a naming convention (or processed comment, etc) for the unit test functions so that the list of functions that are covered can be determined automatically by a program; the ones that aren't covered would be any functions remaining. However, I personally wouldn't find this useful. I don't intend that every function in existence has to have a unit test covering it. Some functions will have already been tested through the exercise of higher-level calling functions (in which case they should probably have internal linkage); others may have been tested through the exercise of the functions it calls. What I'm most keenly interested in, is the ability to verify the logic of how follow/don't-follow is decided (that actually may not be too hard to write tests against now), how Wget handles various protocol-level situations, how it chooses the filename and deals with the local filesystem, etc. I will be very, _very_ happy when everything that's in http_loop and gethttp is verified by unit tests. But a lot of getting to where we can test that may mean abstracting out things like the Berkeley Sockets API and filesystem interactions, so that we can drop in "fake" replacements for testing. > (test tools, other preliminary work for unit test, how manage it ...) There is an incredibly basic test framework, completely defined in src/test.h. See src/test.c for how it is being used. I'm familiar with a framework called (simply) "Check", which might be worth considering. It forks a new process for each test, which isolates it from interfering with the other tests, and also provides a clean way to handle things like segmentation violations or aborts. However, it's intended for Unix, and probably doesn't compile on other systems. http://check.sourceforge.net/ - -- Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer. 
http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFH9d0Q7M8hyUobTrERApdnAJ905n4j0oglUHekP6MJaE4dBCEw+QCeL4RE 0lnwnZgHQjSEom4f1MfiviM= =UejZ -END PGP SIGNATURE-
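
[Editorial sketch] For readers who have not seen the framework in src/test.h, the following shows roughly how a test written against it looks. The two macros follow the form shown in the src/test.h patch posted elsewhere in this archive (with the declaration moved to the top of the block, so it is valid C90). Everything else is an assumption for illustration: count_slashes is a made-up function under test, and the "test_" prefix stands in for the naming convention suggested above, which would let a script list covered functions automatically.

```c
#include <stdio.h>

/* Macros in the style of Wget's src/test.h.  */
#define mu_assert(message, test) \
  do { if (!(test)) return message; } while (0)
#define mu_run_test(test) \
  do { \
    const char *message; \
    puts ("RUNNING TEST " #test "..."); \
    message = test (); \
    tests_run++; \
    if (message) return message; \
    puts ("PASSED\n"); \
  } while (0)

int tests_run;

/* Hypothetical function under test (not a Wget function).  */
static int
count_slashes (const char *path)
{
  int n = 0;
  for (; *path; path++)
    if (*path == '/')
      n++;
  return n;
}

/* Naming convention: test_<function> covers <function>.  */
static const char *
test_count_slashes (void)
{
  mu_assert ("test_count_slashes: empty string",
             count_slashes ("") == 0);
  mu_assert ("test_count_slashes: nested path",
             count_slashes ("foo/bar/baz") == 2);
  return NULL;
}

/* Returns NULL if every test passed, else the first failure message.  */
const char *
all_tests (void)
{
  mu_run_test (test_count_slashes);
  return NULL;
}
```

A driver would call all_tests (), print the returned message if it is not NULL, and report tests_run. Because each test returns at its first failing assertion, a specific message per assertion is what tells you which case broke.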
About Automated Unit Test for Wget
Hello, I want to ask about Unit test of Wget in the future. I want to ask about unit test. Now unit test of Wget is written only for following .c files. -- http.c init.c main.c res.c url.c utils.c (test.c) So as written in Wiki, new unit test suite is necessary. (ref. http://wget.addictivecode.org/FeatureSpecifications/Testing) In order to make new unit test suite, I think following work is necessary. 1) Select functions which can be tested in unit test. But How can I select them? is difficult. Basically the less dependency the function has, more easy to include in unit test, but about boundary line, I'm not sure. 2) In order to do above 1), How about Making a list of all functions in Wget? and maintain? The advantage of 2) is ... * make clear which function is included in Unit test * make clear which function is _not_ in Unit test * make easy to manage testing * make easy to devide testing work So once this list is done, it would become easier to maintain testing schedule, progress, etc.. And when Unit test suite is done, this list should be able to be generated automatically... and to do regression test, all we do is just run Unit test again :) 3) Contents of list I come up is following: * Wget version num * Filename * function name * Included in Unit Test or not * Simple Call graph of the function So let me ask your opinions. And is there any suggestion about unit test of Wget? (test tools, other preliminary work for unit test, how manage it ...) Thank you for your time. -- Yoshihiro TANAKA
fnmatch [Re: wget 1.11.1 make test fails]
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Alain Guibert wrote: > The for loop intended to eat several characters from the string also > advances the pattern pointer. This one reaches the end of the pattern, > and points to a NUL. It is not a '*' anymore, so the loop exits > prematurely. Just below, a test for NUL returns 0. > > The body of the loop, returning FNM_NOMATCH on a slash, is not executed > at all. That isn't moderately broken, is it? I haven't stepped through it, but it sure looks broken to my eyes too. I am tired at the moment, though, so may be missing something. GNUlib has an fnmatch, which might be worth considering for use; but AIUI it suffers from the same overly-locale-aware problem that system fnmatches can suffer from (fnmatch fails when the string isn't encoded properly for the current locale; we often don't even _know_ the original encoding, especially for FTP, and mainly want * to match any arbitrary string of byte values). They were looking for someone to address that issue: http://lists.gnu.org/archive/html/bug-gnulib/2008-02/msg00019.html Perhaps, if I'm motivated and somehow scrounge the time, I can fix the problem in their code, and then use it in ours? :) Or, if someone else with more time would like to tackle it, I'm sure that'd also be welcome. :) I responded to the message linked above with a note that Wget also had a need for such functionality, along with some questions about the approach, but hadn't received a response. Maybe I'll try again. - -- Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer. http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.7 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFH9XBy7M8hyUobTrERAtReAJ94Ac0ClInQOE7qq7OQxon87zj7JACeOTz3 Lfafi0U2phRDnFqQ2IPSx+s= =9yU/ -END PGP SIGNATURE-
Re: wget 1.11.1 make test fails
On Thursday, April 3, 2008 at 11:08:27 +0200, Hrvoje Niksic wrote:

> Well, it would point to a problem with both the fnmatch replacement
> and the older system fnmatch. "Our" fnmatch (coming from an old
> release of Bash

The fnmatch()es in libc 5.4.33 and in Wget are twins. They differ in
some minor details like FNM_CASEFOLD support, and cosmetic things like
parentheses around return(code). The part dealing with * in the pattern
is functionally identical.

> Maybe you could put a breakpoint in fnmatch and see what goes wrong?

The for loop intended to eat several characters from the string also
advances the pattern pointer. This one reaches the end of the pattern,
and points to a NUL. It is not a '*' anymore, so the loop exits
prematurely. Just below, a test for NUL returns 0.

The body of the loop, returning FNM_NOMATCH on a slash, is not executed
at all. That isn't moderately broken, is it?

Alain.
Re: wget 1.11.1 make test fails
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Alain Guibert wrote: > Hello Hrvoje, > > On Wednesday, April 2, 2008 at 12:51:20 +0200, Hrvoje Niksic wrote: > >> Alain Guibert <[EMAIL PROTECTED]> writes: >>> The only failing src/utils.c test_array[] line is: >>> | { { "*COMPLETE", NULL, NULL }, "foo/!COMPLETE", false }, >> Try #undefing SYSTEM_FNMATCH in sysdep.h and see if it works then. > > This old system does HAVE_WORKING_FNMATCH_H (and thus SYSTEM_FNMATCH). > When #undefining SYSTEM_FNMATCH, the test still fails at the very same > line. And then it also fails on modern systems. I guess this points at > the embedded src/cmpt.c:fnmatch() replacement? Are you certain you rebuilt cmpt.o? This seems pretty unlikely, to me. > That also demonstrates the major interest of testsuites. Who would have > noticed the runtime consequences of such obscure libc problem otherwise? > Well done, Micah! Heh, thanks. However, I haven't done much yet with testsuites, despite really really wanting to. In this case, I just added two or three lines to a test that Mauro had written, when I noticed that none of the tests were against slashes or strange-ish characters. Guess that was a pretty lucky addition, then! - -- Coincidence? Or proof that God exists, and wants me to find Wget bugs? :) Micah J. Cowan -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFH9QJ87M8hyUobTrERAsqQAJsFjRfUjjCo63Srs2XbuRBBMVJVQgCfTwZU /sp4Vz8QnIV3I3W3/D6Mgq8= =drfg -END PGP SIGNATURE-
Re: wget 1.11.1 make test fails
On Wednesday, April 2, 2008 at 23:09:52 +0200, Hrvoje Niksic wrote:

> Micah Cowan <[EMAIL PROTECTED]> writes:
>> It's hard for me to imagine an fnmatch that ignores FNM_PATHNAME

The libc 5.4.33 fnmatch() supports FNM_PATHNAME, and there is code
apparently intending to return FNM_NOMATCH on a slash. But this code
seems to be rather broken.

>| printf("%d\n", fnmatch("foo*", "foo/bar", FNM_PATHNAME));
> It should print a non-zero value.

Zero on the old system, FNM_NOMATCH on a recent one.

Alain.
Re: wget 1.11.1 make test fails
Alain Guibert <[EMAIL PROTECTED]> writes: > This old system does HAVE_WORKING_FNMATCH_H (and thus > SYSTEM_FNMATCH). When #undefining SYSTEM_FNMATCH, the test still > fails at the very same line. And then it also fails on modern > systems. I guess this points at the embedded src/cmpt.c:fnmatch() > replacement? Well, it would point to a problem with both the fnmatch replacement and the older system fnmatch. "Our" fnmatch (coming from an old release of Bash, but otherwise very well-tested, both in Bash and Wget) is careful to special-case '/' only if FNM_PATHNAME is specified. Maybe you could put a breakpoint in fnmatch and see what goes wrong?
Re: wget 1.11.1 make test fails
Hello Hrvoje, On Wednesday, April 2, 2008 at 12:51:20 +0200, Hrvoje Niksic wrote: > Alain Guibert <[EMAIL PROTECTED]> writes: >> The only failing src/utils.c test_array[] line is: >> | { { "*COMPLETE", NULL, NULL }, "foo/!COMPLETE", false }, > Try #undefing SYSTEM_FNMATCH in sysdep.h and see if it works then. This old system does HAVE_WORKING_FNMATCH_H (and thus SYSTEM_FNMATCH). When #undefining SYSTEM_FNMATCH, the test still fails at the very same line. And then it also fails on modern systems. I guess this points at the embedded src/cmpt.c:fnmatch() replacement? That also demonstrates the major interest of testsuites. Who would have noticed the runtime consequences of such obscure libc problem otherwise? Well done, Micah! Alain.
Re: wget 1.11.1 make test fails
Micah Cowan <[EMAIL PROTECTED]> writes: > I'm wondering whether it might make sense to go back to completely > ignoring the system-provided fnmatch? One argument against that approach is that it increases code size on systems that do correctly implement fnmatch, i.e. on most modern Unixes that we are targeting. Supporting I18N file names would require modifications to our fnmatch; but on the other hand, we still need it for Windows, so we'd have to make those changes anyway. Providing added value in our fnmatch implementation should go a long way towards preventing complaints of code bloat. > In particular, it would probably resolve the remaining issue with > that one bug you reported about fnmatch() failing on strings whose > encoding didn't match the locale. It would. > Additionally, I've been toying with the idea of adding something > like a "**" to match all characters, including slashes. That would be great. That kind of thing is known to zsh users anyway, and it's a useful feature.
Re: wget 1.11.1 make test fails
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hrvoje Niksic wrote: > Micah Cowan <[EMAIL PROTECTED]> writes: > >>> It sounds like a libc problem rather than a gcc problem. Try >>> #undefing SYSTEM_FNMATCH in sysdep.h and see if it works then. >> It's hard for me to imagine an fnmatch that ignores FNM_PATHNAME: I >> mean, don't most shells rely on this to handle file globbing and >> whatnot? > > The conventional wisdom among free software of the 90s was that > fnmatch() was too buggy to be useful. For that reason all free shells > rolled their own fnmatch, as did other programs that needed it, > including Wget. Maybe the conventional wisdom was right for the > reporter's system. > > Another possibility is that something else is installing fnmatch.h in > a directory on the compiler's search path and breaking the system > fnmatch. IIRC Apache was a known culprit that installed fnmatch.h in > /usr/local/include. That was another reason why Wget used to > completely ignore system-provided fnmatch. I'm wondering whether it might make sense to go back to completely ignoring the system-provided fnmatch? In particular, it would probably resolve the remaining issue with that one bug you reported about fnmatch() failing on strings whose encoding didn't match the locale. Additionally, I've been toying with the idea of adding something like a "**" to match all characters, including slashes. There was a user who had trouble using wildcards to match any directory whose name was (as in the problem example here), "!COMPLETE". At the time I wasn't fully certain that it wasn't a bug in Wget; as I understand it now, in order to match _any_ directory !COMPLETE, you'd have to be sure to exclude "!COMPLETE", "*/!COMPLETE", "*/*/!COMPLETE", etc. I'm not sure if it's original there, but Vim uses a ** pattern, so that you could simply write "**!COMPLETE" (or, if you wanted to be more correct I suppose, just "!COMPLETE" and "**/!COMPLETE"). What do you think? - -- Micah J. 
Cowan Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer. http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.7 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD4DBQFH8/zx7M8hyUobTrERAtpoAJiQHrzVjFwKXxEjteqMGAGgBCMgAJ9rZIah k+92ivTBGpSsmHcLnlsjfQ== =JLn9 -END PGP SIGNATURE-
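
[Editorial sketch] To make the proposed "**" semantics concrete, here is a minimal byte-by-byte matcher. It is only a sketch under stated assumptions, not Wget's matcher and not a patch to fnmatch: glob_match is a hypothetical name, it supports only literal bytes, "?", "*" and "**" (no brackets, no escapes), it always behaves as if FNM_PATHNAME were set, and it treats any run of two or more stars as "**". It compares raw bytes and never consults the locale.

```c
/* Hypothetical sketch: return 1 if pattern p matches all of name n,
   else 0.  '*' and '?' never match '/'; "**" matches any bytes,
   including '/'.  */
int
glob_match (const char *p, const char *n)
{
  for (; *p; p++, n++)
    {
      if (*p == '*')
        {
          /* A second star right away means "may cross slashes".  */
          int cross = (p[1] == '*');

          while (*p == '*')     /* collapse the run of stars */
            p++;

          /* Let the star swallow 0, 1, 2, ... bytes of the name.  */
          for (;; n++)
            {
              if (glob_match (p, n))
                return 1;
              if (*n == '\0' || (*n == '/' && !cross))
                return 0;
            }
        }

      if (*p == '?')
        {
          if (*n == '\0' || *n == '/')
            return 0;
        }
      else if (*p != *n)
        return 0;
    }
  return *n == '\0';
}
```

With this sketch, "*COMPLETE" matches "!COMPLETE" but not "foo/!COMPLETE", whereas "**!COMPLETE" matches "!COMPLETE", "foo/!COMPLETE" and "a/b/!COMPLETE" alike, which is the behaviour described in the message above. The recursion is exponential in the worst case, so a real implementation would want something smarter.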
Re: wget 1.11.1 make test fails
Micah Cowan <[EMAIL PROTECTED]> writes:

>> It sounds like a libc problem rather than a gcc problem. Try
>> #undefing SYSTEM_FNMATCH in sysdep.h and see if it works then.
>
> It's hard for me to imagine an fnmatch that ignores FNM_PATHNAME: I
> mean, don't most shells rely on this to handle file globbing and
> whatnot?

The conventional wisdom among free software of the 90s was that
fnmatch() was too buggy to be useful. For that reason all free shells
rolled their own fnmatch, as did other programs that needed it,
including Wget. Maybe the conventional wisdom was right for the
reporter's system.

Another possibility is that something else is installing fnmatch.h in
a directory on the compiler's search path and breaking the system
fnmatch. IIRC Apache was a known culprit that installed fnmatch.h in
/usr/local/include. That was another reason why Wget used to
completely ignore system-provided fnmatch.

In any case, it should be easy enough to isolate the problem:

    #include <stdio.h>
    #include <fnmatch.h>

    int main()
    {
      printf("%d\n", fnmatch("foo*", "foo/bar", FNM_PATHNAME));
      return 0;
    }

It should print a non-zero value.
Re: wget 1.11.1 make test fails
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hrvoje Niksic wrote: > Alain Guibert <[EMAIL PROTECTED]> writes: > >> Hello Micah, >> >> On Monday, March 31, 2008 at 11:39:43 -0700, Micah Cowan wrote: >> >>> could you try to isolate which part of test_dir_matches_p is failing? >> The only failing src/utils.c test_array[] line is: >> >> | { { "*COMPLETE", NULL, NULL }, "foo/!COMPLETE", false }, >> >> I don't understand enough of dir_matches_p() and fnmatch() to guess >> what is supposed to happen. But with false replaced by true, this >> test and following succeed. > > '*' is not supposed to match '/' in regular fnmatch. Well, that's assuming you pass it the FNM_PATHNAME flag (which, for dir_matches_p, we always do). > It sounds like a libc problem rather than a gcc problem. Try > #undefing SYSTEM_FNMATCH in sysdep.h and see if it works then. It's hard for me to imagine an fnmatch that ignores FNM_PATHNAME: I mean, don't most shells rely on this to handle file globbing and whatnot? - -- Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer. http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFH86+L7M8hyUobTrERApHKAJsFbO8+PtAqFhHJ2Psv1AuKSy17YwCcDsi2 9WHcJ0Pzkc4XmNbcEUCXf6U= =r8ZV -END PGP SIGNATURE-
Re: wget 1.11.1 make test fails
Alain Guibert <[EMAIL PROTECTED]> writes: > Hello Micah, > > On Monday, March 31, 2008 at 11:39:43 -0700, Micah Cowan wrote: > >> could you try to isolate which part of test_dir_matches_p is failing? > > The only failing src/utils.c test_array[] line is: > > | { { "*COMPLETE", NULL, NULL }, "foo/!COMPLETE", false }, > > I don't understand enough of dir_matches_p() and fnmatch() to guess > what is supposed to happen. But with false replaced by true, this > test and following succeed. '*' is not supposed to match '/' in regular fnmatch. It sounds like a libc problem rather than a gcc problem. Try #undefing SYSTEM_FNMATCH in sysdep.h and see if it works then.
Re: wget 1.11.1 make test fails
Hello Micah,

On Monday, March 31, 2008 at 11:39:43 -0700, Micah Cowan wrote:

> could you try to isolate which part of test_dir_matches_p is failing?

The only failing src/utils.c test_array[] line is:

| { { "*COMPLETE", NULL, NULL }, "foo/!COMPLETE", false },

I don't understand enough of dir_matches_p() and fnmatch() to guess
what is supposed to happen. But with false replaced by true, this
test and following succeed.

| ALL TESTS PASSED
| Tests run: 7

Of course this test then fails on newer systems.

Alain.
Re: wget 1.11.1 make test fails
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Alain Guibert wrote: > Hello, > > With an old gcc 2.7.2.1 compiler, wget 1.11.1 make test fails: > > | gcc -I. -I. -I./../src -DHAVE_CONFIG_H > -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" > -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -DTESTING -c ../src/test.c > | ../src/test.c: In function `all_tests': > | ../src/test.c:51: parse error before `const' > The attached make-test.patch seems to fix this. Yeah; that's invalid C90 code; declaration following statement. I'll fix that. > However later the 3rd > test fails: > > | ./unit-tests > | RUNNING TEST test_parse_content_disposition... > | PASSED > | > | RUNNING TEST test_subdir_p... > | PASSED > | > | RUNNING TEST test_dir_matches_p... > | test_dir_matches_p: wrong result > | Tests run: 3 > | make[1]: *** [run-unit-tests] Error 1 > | make[1]: Leaving directory `/tmp/wget-1.11.1/tests' > | make: *** [test] Error 2 That's an interesting failure. I wonder if it's one of the new cases I just added... In any case, it runs through fine for me. This suggests a difference in behavior between your system fnmatch function and mine (since that should be the only bit of external code that dir_matches_p relies on). Pity the tests don't give much clue as to the specifics of what failed... there are about 10 tests for test_dir_matches_p, any of which could have caused the problem. The whole testing thing needs some serious rework; which is my current top priority, when I find time for it (GSoC is eating everything, right now). "make test" isn't actually expected to work completely, right now; some of the .px tests are known to be broken/missing. They're basically provided "as-is". I thought about removing them for the official package; maybe I should have. But if I had, I'd still be blissfully unaware of this potential problem. If you know how, and don't mind, could you try to isolate which part of test_dir_matches_p is failing? 
Perhaps augmenting the error message to spit the match-list and string arguments... - -- Thanks, Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer. http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.7 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFH8S/v7M8hyUobTrERAhrPAJ9N+XqLeVP0NN9HkLxO162Zf2uJnACeMwUo kew/FkMA2GljqWiPG6IC+zs= =fQSH -END PGP SIGNATURE-
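
[Editorial sketch] One way to get the diagnostics asked for above is a table-driven test that prints the pattern list and the string for every failing case. This is a sketch under stated assumptions: matches_any is a simplified stand-in written for this example and is not Wget's dir_matches_p, and only the first table row comes from this thread (the others are made up). It calls the system fnmatch with FNM_PATHNAME, and uses <stdbool.h>, so it targets a C99 compiler rather than the old gcc discussed here.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for dir_matches_p (NOT Wget's implementation):
   true if any pattern in the NULL-terminated list matches dir.  */
static bool
matches_any (const char *const *patterns, const char *dir)
{
  for (; *patterns; patterns++)
    if (fnmatch (*patterns, dir, FNM_PATHNAME) == 0)
      return true;
  return false;
}

struct match_case
{
  const char *patterns[3];      /* NULL-terminated pattern list */
  const char *dir;
  bool expected;
};

static const struct match_case cases[] = {
  { { "*COMPLETE", NULL, NULL }, "foo/!COMPLETE", false },
  { { "*COMPLETE", NULL, NULL }, "!COMPLETE", true },
  { { "foo*", "bar", NULL }, "bar", true },
};

/* Run every case, printing the inputs of each one that fails.
   Returns the number of failing cases.  */
int
run_match_cases (void)
{
  size_t i;
  int failures = 0;

  for (i = 0; i < sizeof cases / sizeof cases[0]; i++)
    {
      const struct match_case *c = &cases[i];
      bool got = matches_any (c->patterns, c->dir);

      if (got != c->expected)
        {
          const char *const *p;

          fprintf (stderr, "case %lu: patterns {", (unsigned long) i);
          for (p = c->patterns; *p; p++)
            fprintf (stderr, " \"%s\"", *p);
          fprintf (stderr, " } dir \"%s\": expected %s, got %s\n",
                   c->dir, c->expected ? "true" : "false",
                   got ? "true" : "false");
          failures++;
        }
    }
  return failures;
}
```

On a system whose fnmatch honours FNM_PATHNAME, run_match_cases () should return 0; on a system with the behaviour suspected in this thread, the first row would be reported together with its pattern list and directory string, instead of a bare "wrong result".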
wget 1.11.1 make test fails
Hello,

With an old gcc 2.7.2.1 compiler, wget 1.11.1 make test fails:

| gcc -I. -I. -I./../src -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -DTESTING -c ../src/test.c
| ../src/test.c: In function `all_tests':
| ../src/test.c:51: parse error before `const'
| ../src/test.c:51: `message' undeclared (first use this function)
| ../src/test.c:51: (Each undeclared identifier is reported only once
| ../src/test.c:51: for each function it appears in.)
| ../src/test.c:52: parse error before `const'
| ../src/test.c:53: parse error before `const'
| ../src/test.c:54: parse error before `const'
| ../src/test.c:55: parse error before `const'
| ../src/test.c:56: parse error before `const'
| ../src/test.c:57: parse error before `const'
| make[1]: *** [test.o] Error 1
| make[1]: Leaving directory `/tmp/wget-1.11.1/tests'
| make: *** [test] Error 2

The attached make-test.patch seems to fix this. However later the 3rd
test fails:

| ./unit-tests
| RUNNING TEST test_parse_content_disposition...
| PASSED
|
| RUNNING TEST test_subdir_p...
| PASSED
|
| RUNNING TEST test_dir_matches_p...
| test_dir_matches_p: wrong result
| Tests run: 3
| make[1]: *** [run-unit-tests] Error 1
| make[1]: Leaving directory `/tmp/wget-1.11.1/tests'
| make: *** [test] Error 2

Alain.

diff -prud wget-1.11.1.orig/src/test.h wget-1.11.1/src/test.h
--- wget-1.11.1.orig/src/test.h Mon Mar 24 22:53:58 2008
+++ wget-1.11.1/src/test.h Mon Mar 31 15:19:31 2008
@@ -34,8 +34,9 @@ as that of the covered work. */
 #define mu_assert(message, test) do { if (!(test)) return message; } while (0)
 #define mu_run_test(test) \
 do { \
+  const char *message; \
   puts("RUNNING TEST " #test "..."); \
-  const char *message = test(); \
+  message = test(); \
   tests_run++; \
   if (message) return message; \
   puts("PASSED\n"); \
Re: Test from Sohail
On Wed, 20 Feb 2008 16:00:06 +0500 "sohail" <[EMAIL PROTECTED]> wrote: > Test Message Now my day is complete. Not one, but two worthless 'test' messages. -- Gerard [EMAIL PROTECTED] Marijuana: Nature's way of saying, "Hi!" signature.asc Description: PGP signature
Test from Sohail
Test Message
Test from Sohail
Test Message
test
A loop-test; trouble with my subscription. --gv
Re: Some test results with current svn version
Zitat von Micah Cowan <[EMAIL PROTECTED]>: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Jochen Roderburg wrote: > > I have tried out again the current wget version from svn to see the > progress on > > various discussed problems. > > > >>> Someone recently reported an inability to specify the prefix for libssl. > > > > Hmm, yes, I reported this a year ago ;-) > > > > That works now again as expected. When I specify > --with-libssl-prefix=/usr/local > > I get the correct libs in the Makefile(s): > > This has just recently been fixed; it had to do with the fact that we > were using sh "if" where we should have been using autoconf "AS_IF". > This has unfortunate interactions with autoconf's mechanisms for > automated dependency resolution. Sorry I didn't reply to my previous > message to say so, but I wasn't sure anyone had paid attention to it ;) And my mail was meant as a confirmation that this fix works ;-) (I had seen your previous message about this and saw in the code that something had been done regarding this issue.) > > > I see, however, no difference yet regarding Content-Disposition, despite > the > > explanations in ChangeLogs and recent mails that there is now an option for > it > > which is off as default. > > Mauro has just finished some code related to this, so you can try it out > when that has gone into the trunk. :) > OK, I'll have another look next weekend. Best Regards, Jochen Roderburg
Re: Some test results with current svn version
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Jochen Roderburg wrote: > I have tried out again the current wget version from svn to see the progress > on > various discussed problems. > >>> Someone recently reported an inability to specify the prefix for libssl. > > Hmm, yes, I reported this a year ago ;-) > > That works now again as expected. When I specify > --with-libssl-prefix=/usr/local > I get the correct libs in the Makefile(s): This has just recently been fixed; it had to do with the fact that we were using sh "if" where we should have been using autoconf "AS_IF". This has unfortunate interactions with autoconf's mechanisms for automated dependency resolution. Sorry I didn't reply to my previous message to say so, but I wasn't sure anyone had paid attention to it ;) > I see, however, no difference yet regarding Content-Disposition, despite the > explanations in ChangeLogs and recent mails that there is now an option for it > which is off as default. Mauro has just finished some code related to this, so you can try it out when that has gone into the trunk. :) - -- Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer... http://micah.cowan.name/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGv8f37M8hyUobTrERCE9xAKCO/yyzWIvXz2PgMwDZe2iwQoqrUACeNzqf HFZ67p0MLnbMhP7DEbVTiZk= =YBqp -END PGP SIGNATURE-
Some test results with current svn version
I have tried out again the current wget version from svn to see the progress
on various discussed problems.

>> Someone recently reported an inability to specify the prefix for libssl.

Hmm, yes, I reported this a year ago ;-)

That works now again as expected. When I specify
--with-libssl-prefix=/usr/local I get the correct libs in the Makefile(s):

LIBS = -lintl -ldl -lrt /usr/local/lib/libssl.so /usr/local/lib/libcrypto.so -Wl,-rpath -Wl,/usr/local/lib

I understand that the HEAD/GET issues are still in discussion and testing,
the current state that I see now is:

no timestamping, no local file    no HEAD
no timestamping, local file       no HEAD
timestamping,    no local file    HEAD
timestamping,    local file       HEAD

(and the file-transfer as such works again).

I see, however, no difference yet regarding Content-Disposition, despite the
explanations in ChangeLogs and recent mails that there is now an option for it
which is off as default.

Best Regards, Jochen Roderburg

ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10          Tel.: +49-221/478-7024
D-50931 Koeln                E-Mail: [EMAIL PROTECTED]
Germany
Re: test suites/scripts for Wget
Dharmesh Vyas wrote:
> Hello Group,
>
> Are there any automated tests available for Wget that come with the
> component itself, or is there any way we can write some tests for it?

Actually, yes. In the SVN trunk:

http://svn.dotsrc.org/repo/wget/trunk/

we have just implemented support for unit testing and a working but immature
Perl-based mechanism for feature testing (look in the tests dir).

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                          http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it
test suites/scripts for Wget
Hello Group,

Are there any automated tests available for Wget that come with the component itself, or is there any way we can write some tests for it?

Thanks,
Dharmesh Vyas.
[SpikeSource India]
RE: Test a websites availability
Thanks,

wget --spider -o log.txt -i test.html

worked for me...

-Original Message-
From: Frank McCown [mailto:[EMAIL PROTECTED]
Sent: Friday, August 19, 2005 8:31 AM
To: wget@sunsite.dk
Cc: Arthur DiSegna
Subject: Re: Test a websites availability

You can use the --spider option to not download the file and use the -o option to write the wget output to a log file. Then parse the log file and see if it contains a 200 OK response.

Frank

Arthur DiSegna wrote:
> Hello,
>
> I have been looking around and haven't found an answer yet...
>
> Is it possible to use WGET to test a website for being up only? I don't want to download any files, just test the site for availability. Any ideas?
>
> Pseudo code: wget www.google.com. Is connected? YES/NO. If YES write answer to a log. If NO write to a log and timestamp it.
>
> Thanks

--
Frank McCown
Old Dominion University
http://www.cs.odu.edu/~fmccown
Re: Test a websites availability
You can use the --spider option to not download the file and use the -o option to write the wget output to a log file. Then parse the log file and see if it contains a 200 OK response.

Frank

Arthur DiSegna wrote:
> Hello,
>
> I have been looking around and haven't found an answer yet...
>
> Is it possible to use WGET to test a website for being up only? I don't want to download any files, just test the site for availability. Any ideas?
>
> Pseudo code: wget www.google.com. Is connected? YES/NO. If YES write answer to a log. If NO write to a log and timestamp it.
>
> Thanks

--
Frank McCown
Old Dominion University
http://www.cs.odu.edu/~fmccown
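The "parse the log file" step could be sketched in C roughly as follows. This is only an illustration under stated assumptions: `log_has_200_ok` is a hypothetical helper, not part of Wget, and it assumes nothing about the log format beyond the substring "200 OK" appearing somewhere on a line, as suggested above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Wget): return 1 if any line read from
   the already-open stream LOG contains the text "200 OK", 0 otherwise.
   Lines longer than the buffer are read in pieces, so a match that
   straddles a piece boundary would be missed; acceptable for a sketch.  */
static int
log_has_200_ok (FILE *log)
{
  char line[1024];

  while (fgets (line, sizeof line, log))
    if (strstr (line, "200 OK"))
      return 1;
  return 0;
}
```

A caller would open the file written with -o (log.txt, say) using fopen(), pass the stream to the helper, and then append YES or NO plus a timestamp to its own log.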
Test a websites availability
Hello,

I have been looking around and haven't found an answer yet...

Is it possible to use WGET to test a website for being up only? I don't want to download any files, just test the site for availability. Any ideas?

Pseudo code: wget www.google.com. Is connected? YES/NO. If YES write answer to a log. If NO write to a log and timestamp it.

Thanks
Re: NTLM test server available
Thanks a lot for setting this up. I'll try to get Wget to log in. BTW how are you running IIS on the Linux workstation? vmware?
NTLM test server available
Hi! You may use this server to test NTLM-authentication on wget. http://212.50.205.135/ntlm/ usr: testuser pwd: DummyUser Server is not promised to be available all the time, because it is actually running under my linux workstation. Server is MS Windows Server 2003 Enterprise Edition + IIS 6.0. Hope this helps you on testing. Sami
Re: NTLM test server
Daniel Stenberg <[EMAIL PROTECTED]> writes:

> I had friends providing the test servers for both host and proxy
> authentication when I've worked on NTLM code.

It's a shame that those test servers are no longer available. I don't think it will be possible to finish the NTLM code without some sort of test bed.

> o I had a buffer security problem in the NTLM code, but related to
>   base64 decode function and that is bound to be different when you
>   adapt the code to wget conditions anyway.

I've seen notice of it and fixed allocation of BUFFER accordingly.

> o There was also another less alarming buffer problem with a
>   memset() of 8 bytes instead of 5. You may of course already
>   have found and fixed this.

I've now changed this, thanks.

> o POSTing with NTLM auth is a pain, since NTLM is for connections and
>   thus you cannot close the connection without breaking the auth, so
>   you are (more likely than with other multi-pass auth methods) forced
>   to send the full request body multiple times.

I guess there's no avoiding that.
Re: NTLM test server
On Mon, 4 Apr 2005, Hrvoje Niksic wrote:

> Is there a test server where one can try out NTLM authentication? I'm
> working on adapting Daniel's code to Wget, and having a test server would
> be of great help.

Just for your info: I had friends providing the test servers for both host and proxy authentication when I've worked on NTLM code. Once the basics worked, I could add test cases to the curl test suite. Nowadays I can test NTLM on my own with the curl test server. I'm afraid it is too specific for curl to be useful for wget.

Also, since the day I provided that code, I've learned a few additional things:

 o I had a buffer security problem in the NTLM code, but related to
   base64 decode function and that is bound to be different when you
   adapt the code to wget conditions anyway.

 o There was also another less alarming buffer problem with a
   memset() of 8 bytes instead of 5. You may of course already
   have found and fixed this.

 o POSTing with NTLM auth is a pain, since NTLM is for connections and
   thus you cannot close the connection without breaking the auth, so
   you are (more likely than with other multi-pass auth methods) forced
   to send the full request body multiple times.

And probably a little more that I've forgotten to mention now. ;-)

--
 -=- Daniel Stenberg -=- http://daniel.haxx.se -=-
 ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
NTLM test server
Is there a test server where one can try out NTLM authentication? I'm working on adapting Daniel's code to Wget, and having a test server would be of great help.
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
David Fritz <[EMAIL PROTECTED]> writes: > Hrvoje Niksic wrote: >> Thanks for the patch, I've now applied it to CVS. >> You might want to add a comment in front of fake_fork() explaining >> what it does, and why. The comment doesn't have to be long, only >> several sentences so that someone reading the code later understands >> what the heck a "fake fork" is and why we're performing it. > > Ok, I hope this is sufficient. [...] It's perfect. I've now added it.
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
Hrvoje Niksic wrote:
> Thanks for the patch, I've now applied it to CVS.
>
> You might want to add a comment in front of fake_fork() explaining
> what it does, and why. The comment doesn't have to be long, only
> several sentences so that someone reading the code later understands
> what the heck a "fake fork" is and why we're performing it.

Ok, I hope this is sufficient.

Cheers

Index: src/mswindows.c
===
RCS file: /pack/anoncvs/wget/src/mswindows.c,v
retrieving revision 1.30
diff -u -r1.30 mswindows.c
--- src/mswindows.c 2004/03/24 19:16:08 1.30
+++ src/mswindows.c 2004/03/24 23:52:59
@@ -202,7 +202,14 @@
   return 1; /* We are the child. */
 }
 
-
+/* Windows doesn't support the fork() call; so we fake it by invoking
+   another copy of Wget with the same arguments with which we were
+   invoked. The child copy of Wget should perform the same initialization
+   sequence as the parent; so we should have two processes that are
+   essentially identical. We create a specially named section object that
+   allows the child to distinguish itself from the parent and is used to
+   exchange information between the two processes. We use an event object
+   for synchronization. */
 static void
 fake_fork (void)
 {
@@ -343,6 +350,8 @@
   /* We failed, return. */
 }
 
+/* This is the corresponding Windows implementation of the
+   fork_to_background() function in utils.c. */
 void
 fork_to_background (void)
 {
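The "specially named section object" works because both processes can derive the same name independently: in the patch, the parent builds it from the child's process id and the child builds it from its own. A minimal portable sketch of that naming step follows. The format string is the one used by make_section_name() in the patch; `format_section_name` itself is a hypothetical stand-in that writes into a caller-supplied buffer instead of using Wget's aprintf().

```c
#include <stdio.h>

/* Hypothetical stand-in for the patch's make_section_name(): format the
   section name for process id PID into BUF (of SIZE bytes).  Returns what
   snprintf() returns, i.e. the length the full name needs.  */
static int
format_section_name (char *buf, size_t size, unsigned long pid)
{
  return snprintf (buf, size, "gnu_wget_fake_fork_%lu", pid);
}
```

In the patch the parent creates the mapping under the name built from the child's id, and a process that fails to open a mapping named after its own id (ERROR_FILE_NOT_FOUND) concludes that it is the parent.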
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
David Fritz <[EMAIL PROTECTED]> writes:

> Ok, I'll submit a patch later tonight. Do you think it would be a
> good idea to include README.fork in windows/ (the directory with the
> Windows Makefiles, etc. in it)?

I don't think that's necessary. Simply explain how the fork emulation works in mswindows.c.
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
Hrvoje Niksic wrote:
> Thanks for the patch, I've now applied it to CVS.
>
> You might want to add a comment in front of fake_fork() explaining
> what it does, and why. The comment doesn't have to be long, only
> several sentences so that someone reading the code later understands
> what the heck a "fake fork" is and why we're performing it.

Ok, I'll submit a patch later tonight. Do you think it would be a good idea to include README.fork in windows/ (the directory with the Windows Makefiles, etc. in it)? (If so, I'd like to tweak it a little first, though.)

Thanks
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
Thanks for the patch, I've now applied it to CVS. You might want to add a comment in front of fake_fork() explaining what it does, and why. The comment doesn't have to be long, only several sentences so that someone reading the code later understands what the heck a "fake fork" is and why we're performing it.
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
Hrvoje Niksic wrote:
> For now I'd start with applying David's patch, so that people can test
> its functionality. It is easy to fix the behavior of `wget -q -b'
> later.
>
> David, can I apply your patch now?

Sure.

The attached patch corrects a few minor formatting details but is otherwise identical to the previous one.

Index: src/mswindows.c
===
RCS file: /pack/anoncvs/wget/src/mswindows.c,v
retrieving revision 1.29
diff -u -r1.29 mswindows.c
--- src/mswindows.c 2004/03/19 23:54:27 1.29
+++ src/mswindows.c 2004/03/24 17:50:32
@@ -131,10 +131,240 @@
   FreeConsole ();
 }
 
+/* Construct the name for a named section (a.k.a. `file mapping') object.
+   The returned string is dynamically allocated and needs to be xfree()'d. */
+static char *
+make_section_name (DWORD pid)
+{
+  return aprintf ("gnu_wget_fake_fork_%lu", pid);
+}
+
+/* This structure is used to hold all the data that is exchanged between
+   parent and child. */
+struct fake_fork_info
+{
+  HANDLE event;
+  int changedp;
+  char lfilename[MAX_PATH + 1];
+};
+
+/* Determines if we are the child and if so performs the child logic.
+   Return values:
+     < 0  error
+       0  parent
+     > 0  child
+*/
+static int
+fake_fork_child (void)
+{
+  HANDLE section, event;
+  struct fake_fork_info *info;
+  char *name;
+  DWORD le;
+
+  name = make_section_name (GetCurrentProcessId ());
+  section = OpenFileMapping (FILE_MAP_WRITE, FALSE, name);
+  le = GetLastError ();
+  xfree (name);
+  if (!section)
+    {
+      if (le == ERROR_FILE_NOT_FOUND)
+        return 0; /* Section object does not exist; we are the parent. */
+      else
+        return -1;
+    }
+
+  info = MapViewOfFile (section, FILE_MAP_WRITE, 0, 0, 0);
+  if (!info)
+    {
+      CloseHandle (section);
+      return -1;
+    }
+
+  event = info->event;
+
+  if (!opt.lfilename)
+    {
+      opt.lfilename = unique_name (DEFAULT_LOGFILE, 0);
+      info->changedp = 1;
+      strncpy (info->lfilename, opt.lfilename, sizeof (info->lfilename));
+      info->lfilename[sizeof (info->lfilename) - 1] = '\0';
+    }
+  else
+    info->changedp = 0;
+
+  UnmapViewOfFile (info);
+  CloseHandle (section);
+
+  /* Inform the parent that we've done our part. */
+  if (!SetEvent (event))
+    return -1;
+
+  CloseHandle (event);
+  return 1; /* We are the child. */
+}
+
+
+static void
+fake_fork (void)
+{
+  char *cmdline, *args;
+  char exe[MAX_PATH + 1];
+  DWORD exe_len, le;
+  SECURITY_ATTRIBUTES sa;
+  HANDLE section, event, h[2];
+  STARTUPINFO si;
+  PROCESS_INFORMATION pi;
+  struct fake_fork_info *info;
+  char *name;
+  BOOL rv;
+
+  event = section = pi.hProcess = pi.hThread = NULL;
+
+  /* Get command line arguments to pass to the child process.
+     We need to skip the name of the command (what amounts to argv[0]). */
+  cmdline = GetCommandLine ();
+  if (*cmdline == '"')
+    {
+      args = strchr (cmdline + 1, '"');
+      if (args)
+        ++args;
+    }
+  else
+    args = strchr (cmdline, ' ');
+
+  /* It's ok if args is NULL, that would mean there were no arguments
+     after the command name. As it is now though, we would never get here
+     if that were true. */
+
+  /* Get the fully qualified name of our executable. This is more reliable
+     than using argv[0]. */
+  exe_len = GetModuleFileName (GetModuleHandle (NULL), exe, sizeof (exe));
+  if (!exe_len || (exe_len >= sizeof (exe)))
+    return;
+
+  sa.nLength = sizeof (sa);
+  sa.lpSecurityDescriptor = NULL;
+  sa.bInheritHandle = TRUE;
+
+  /* Create an anonymous inheritable event object that starts out
+     non-signaled. */
+  event = CreateEvent (&sa, FALSE, FALSE, NULL);
+  if (!event)
+    return;
+
+  /* Create the child process detached from the current console and in a
+     suspended state. */
+  memset (&si, 0, sizeof (si));
+  si.cb = sizeof (si);
+  rv = CreateProcess (exe, args, NULL, NULL, TRUE, CREATE_SUSPENDED |
+                      DETACHED_PROCESS, NULL, NULL, &si, &pi);
+  if (!rv)
+    goto cleanup;
+
+  /* Create a named section object with a name based on the process id of
+     the child. */
+  name = make_section_name (pi.dwProcessId);
+  section =
+    CreateFileMapping (INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE, 0,
+                       sizeof (struct fake_fork_info), name);
+  le = GetLastError();
+  xfree (name);
+  /* Fail if the section object already exists (should not happen). */
+  if (!section || (le == ERROR_ALREADY_EXISTS))
+    {
+      rv = FALSE;
+      goto cleanup;
+    }
+
+  /* Copy the event handle into the section object. */
+  info = MapViewOfFile (section, FILE_MAP_WRITE, 0, 0, 0);
+  if (!info)
+    {
+      rv = FALSE;
+      goto cleanup;
+    }
+
+  info->event = event;
+
+  UnmapViewOfFile (info);
+
+  /* S
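The step in fake_fork() that skips the command name can be tried outside Windows with a small portable sketch. `skip_command_name` is a hypothetical wrapper around the same two strchr() calls; its `cmdline` parameter stands in for the string returned by GetCommandLine(), which is an assumption made so the logic can run anywhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical portable wrapper around the argv[0]-skipping logic in
   fake_fork().  Returns a pointer to whatever follows the command name
   (including the separating space), or NULL if nothing follows it.  */
static const char *
skip_command_name (const char *cmdline)
{
  const char *args;

  if (*cmdline == '"')
    {
      /* Quoted program name: resume just past the closing quote.  */
      args = strchr (cmdline + 1, '"');
      if (args)
        ++args;
    }
  else
    /* Unquoted program name: it ends at the first space.  */
    args = strchr (cmdline, ' ');

  return args;
}
```

The quoted case matters because a path such as "C:\Program Files\wget.exe" contains a space, so looking for the first space alone would cut the program name in half.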
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
For now I'd start with applying David's patch, so that people can test its functionality. It is easy to fix the behavior of `wget -q -b' later. David, can I apply your patch now?
RE: [PATCH] A working implementation of fork_to_background() under Windows – please test
> From: David Fritz [mailto:[EMAIL PROTECTED]
> Subject: Re: [PATCH] A working implementation of fork_to_background()

> The current patch attempts to emulate the behavior of the Unix version.
> AFAICT, this and the following suggestion apply equally well to the
> existing (Unix) code.

I agree.

> change. I'm unsure such complexity is necessary.

I agree with you, I just thought to mention the possibility. Possibly documenting the race would be enough. But I think Hrvoje should make the decision about this issue.

Heiko

--
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax
Re: [PATCH] A working implementation of fork_to_background() under Windows – please test
Herold Heiko wrote:
> MSVC binary at http://xoomer.virgilio.it/hherold/ for public testing.
> I performed only basic tests on NT4 sp6a, everything performed fine as
> expected.

Thank you very much for testing and hosting the binary.

> Some ideas on this thing:

I'll respond to your points out-of-order.

> In verbose mode the child should probably acknowledge in the log file the
> fact it was invoked as child.

The current patch attempts to emulate the behavior of the Unix version. AFAICT, this and the following suggestion apply equally well to the existing (Unix) code.

> In quiet mode the parent log (child pid, "log on wget-log" or whatever)
> probably should not be printed.

Also, perhaps in quiet mode the child should not automatically set a log file if none was specified. IIUC, the log file would always be empty.

> In debug mode the client should probably also log the name of the section
> object and any information retrieved from it (currently the flag only).

Sure, I could add a number of debug prints.

> A possible fix for the wgetrc race condition could be caching the content
> of the whole wgetrc in the parent and transmit it in the section object to
> the child, a bit messy I must admit but a possible solution if that race
> condition is considered a Bad Thing.

That would work, but would require making changes to the main code and would require performing the child detection logic much earlier (even before we know if -b was specified). We could also exploit Windows file-sharing semantics or file locking features to guarantee the config files can't change. I'm unsure such complexity is necessary.

> About the only scenario I could think of is where you have a script
> creating a custom wgetrc, run wget, then change the wgetrc: introduce -b
> and the script could change the wgetrc after running wget but before the
> parsing on client side, a rather remote but possible scenario.

In this scenario, the script would have to wait for the parent to terminate to avoid a race, even with the Unix version. With this patch the child would have necessarily finished reading any wgetrc files before the parent terminates. So there shouldn't be a problem.

Thanks again, David Fritz
RE: [PATCH] A working implementation of fork_to_background() under Windows – please test
MSVC binary at http://xoomer.virgilio.it/hherold/ for public testing. I performed only basic tests on NT4 sp6a, everything performed fine as expected.

Some ideas on this thing:

In verbose mode the child should probably acknowledge in the log file the fact it was invoked as child.

In debug mode the client should probably also log the name of the section object and any information retrieved from it (currently the flag only).

In quiet mode the parent log (child pid, "log on wget-log" or whatever) probably should not be printed.

A possible fix for the wgetrc race condition could be caching the content of the whole wgetrc in the parent and transmit it in the section object to the child, a bit messy I must admit but a possible solution if that race condition is considered a Bad Thing.

About the only scenario I could think of is where you have a script creating a custom wgetrc, run wget, then change the wgetrc: introduce -b and the script could change the wgetrc after running wget but before the parsing on client side, a rather remote but possible scenario.

Heiko

--
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax

> -Original Message-
> From: David Fritz [mailto:[EMAIL PROTECTED]
> Sent: Saturday, March 20, 2004 2:37 AM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [PATCH] A working implementation of fork_to_background() under Windows – please test
>
> Attached is an implementation of fork_to_background() for Windows that (I hope) has the desired effect under both 9x and NT.
>
> _This is a preliminary patch and needs to be tested._
>
> The patch is dependent upon the fact that the only time fork_to_background() is called is on start-up when -b is specified.
>
> Windows of course does not support the fork() call, so it must be simulated. This can be done by creating a new process and using some form of inter-process communication to transfer the state of the old process to the new one. This requires the parent and child to cooperate and when done in a general way (such as by Cygwin) requires a lot of work.
>
> However, with Wget since we have a priori knowledge of what could have changed in the parent by the time we call fork(), we could implement a special purpose fork() that only passes to the child the things that we know could have changed. (The initialization done by the C run-time library, etc. would be performed anew in the child, but hold on a minute.)
>
> The only real work done by Wget before calling fork() is the reading of wgetrc files and the processing of command-line arguments. Passing this information directly to the child would be possible, but the implementation would be complex and fragile. It would need to be updated as changes are made to the main code.
>
> It would be much simpler to simply perform the initialization (reading of config files, processing of args, etc.) again in the child. This would have a small performance impact and introduce some race-conditions, but I think the advantages (having -b work) outweigh the disadvantages.
>
> The implementation is, I hope, fairly straightforward. I have attempted to explain it in moderate detail in an attached README.
>
> I'm hoping others can test it with various operating systems and compilers. Also, any feedback regarding the design or implementation would be welcome. Do you feel this is the right way to go about this?
>
> Cheers,
> David Fritz
>
> 2004-03-19 David Fritz <[EMAIL PROTECTED]>
>
> * mswindows.c (make_section_name, fake_fork, fake_fork_child): New
> functions.
> (fork_to_background): Replace with new implementation.
[PATCH] A working implementation of fork_to_background() under Windows – please test
Attached is an implementation of fork_to_background() for Windows that (I hope) has the desired effect under both 9x and NT.

_This is a preliminary patch and needs to be tested._

The patch is dependent upon the fact that the only time fork_to_background() is called is on start-up when -b is specified.

Windows of course does not support the fork() call, so it must be simulated. This can be done by creating a new process and using some form of inter-process communication to transfer the state of the old process to the new one. This requires the parent and child to cooperate and when done in a general way (such as by Cygwin) requires a lot of work.

However, with Wget since we have a priori knowledge of what could have changed in the parent by the time we call fork(), we could implement a special purpose fork() that only passes to the child the things that we know could have changed. (The initialization done by the C run-time library, etc. would be performed anew in the child, but hold on a minute.)

The only real work done by Wget before calling fork() is the reading of wgetrc files and the processing of command-line arguments. Passing this information directly to the child would be possible, but the implementation would be complex and fragile. It would need to be updated as changes are made to the main code.

It would be much simpler to simply perform the initialization (reading of config files, processing of args, etc.) again in the child. This would have a small performance impact and introduce some race-conditions, but I think the advantages (having -b work) outweigh the disadvantages.

The implementation is, I hope, fairly straightforward. I have attempted to explain it in moderate detail in an attached README.

I'm hoping others can test it with various operating systems and compilers. Also, any feedback regarding the design or implementation would be welcome. Do you feel this is the right way to go about this?

Cheers,
David Fritz

2004-03-19 David Fritz <[EMAIL PROTECTED]>

* mswindows.c (make_section_name, fake_fork, fake_fork_child): New
functions.
(fork_to_background): Replace with new implementation.

Index: src/mswindows.c
===
RCS file: /pack/anoncvs/wget/src/mswindows.c,v
retrieving revision 1.29
diff -u -r1.29 mswindows.c
--- src/mswindows.c 2004/03/19 23:54:27 1.29
+++ src/mswindows.c 2004/03/20 01:34:15
@@ -131,10 +131,240 @@
   FreeConsole ();
 }
 
+/* Construct the name for a named section (a.k.a `file mapping') object.
+   The returned string is dynamically allocated and needs to be xfree()'d. */
+static char *
+make_section_name (DWORD pid)
+{
+    return aprintf("gnu_wget_fake_fork_%lu", pid);
+}
+
+/* This structure is used to hold all the data that is exchanged between
+   parent and child. */
+struct fake_fork_info
+{
+  HANDLE event;
+  int changedp;
+  char lfilename[MAX_PATH + 1];
+};
+
+/* Determines if we are the child and if so performs the child logic.
+   Return values:
+     < 0  error
+       0  parent
+     > 0  child
+*/
+static int
+fake_fork_child (void)
+{
+  HANDLE section, event;
+  struct fake_fork_info *info;
+  char *name;
+  DWORD le;
+
+  name = make_section_name (GetCurrentProcessId ());
+  section = OpenFileMapping (FILE_MAP_WRITE, FALSE, name);
+  le = GetLastError ();
+  xfree (name);
+  if (!section)
+    {
+      if (le == ERROR_FILE_NOT_FOUND)
+        return 0; /* Section object does not exist; we are the parent. */
+      else
+        return -1;
+    }
+
+  info = MapViewOfFile (section, FILE_MAP_WRITE, 0, 0, 0);
+  if (!info)
+    {
+      CloseHandle (section);
+      return -1;
+    }
+
+  event = info->event;
+
+  if (!opt.lfilename)
+    {
+      opt.lfilename = unique_name (DEFAULT_LOGFILE, 0);
+      info->changedp = 1;
+      strncpy (info->lfilename, opt.lfilename, sizeof (info->lfilename));
+      info->lfilename[sizeof (info->lfilename) - 1] = '\0';
+    }
+  else
+    info->changedp = 0;
+
+  UnmapViewOfFile (info);
+  CloseHandle (section);
+
+  /* Inform the parent that we've done our part. */
+  if (!SetEvent (event))
+    return -1;
+
+  CloseHandle (event);
+  return 1; /* We are the child. */
+}
+
+
+static void
+fake_fork (void)
+{
+  char *cmdline, *args;
+  char exe[MAX_PATH + 1];
+  DWORD exe_len, le;
+  SECURITY_ATTRIBUTES sa;
+  HANDLE section, event, h[2];
+  STARTUPINFO si;
+  PROCESS_INFORMATION pi;
+  struct fake_fork_info *info;
+  char *name;
+  BOOL rv;
+
+  event = section = pi.hProcess = pi.hThread = NULL;
+
+  /* Get command line arguments to pass to the child process.
+     We need to skip the name of the command (what amounts to argv[0]). */
+  cmdline = GetCommandLine ();
+  if (*cmdline == '"')
+    {
+      args = strchr (cmdline + 1, '"');
+      if (a
Test
test
The message cannot be represented in 7-bit ASCII encoding and has been sent as a binary attachment. doc.zip Description: Binary data
Test
Test
test -ignore
-- Do you guys know what you're doing, or are you just hacking?
test please ignore
-- SunSITE.dk Staff http://SunSITE.dk
problem question and test
Wondering if anyone exists on this list... My question and problem is that I'm trying to mirror a product manual that is used externally and so is not well-formed HTML, or at least that is my guess as to why wget isn't working. I'm getting unsupported scheme errors but the file contains "
Re: [BUG] assert test msecs
> I have run across this problem too. It is because with Linux 2.4.18 (and
> other versions??) in certain circumstances, gettimeofday() is broken and
> will jump backwards. See
> http://kt.zork.net/kernel-traffic/kt20020708_174.html#1.
>
> Is there any particular reason for this assert? If there is, maybe:
> if (msecs < 0) msecs = 0;
> would be more suitable.

Seems like this is only used to calculate a rate to display on the screen. Maybe we should just accept Linux's opinion that time is going backwards. Eventually it should go forwards again. :-)

Cheers,
Colin
Re: [BUG] assert test msecs
Hartwig, Thomas wrote:
> I got an assert exit of wget in "retr.c" in the function "calc_rate"
> because "msecs" is 0 or less than 0 (in rare cases).
> I don't know how, perhaps because I have a big line to the server or
> the wrong OS. To work around this I patched "retr.c", setting
> "msecs = 1" if equal to or below zero.
>
> Some information is added below, what else do you need?
>
> #: cat /proc/version
> Linux version 2.4.18 (root@netbrain) (gcc version 2.96 2731 (Red
> Hat Linux 7.3 2.96-110)) #4 Sun Jul 28 09:01:06 CEST 2002

I have run across this problem too. It is because with Linux 2.4.18 (and other versions??) in certain circumstances, gettimeofday() is broken and will jump backwards. See http://kt.zork.net/kernel-traffic/kt20020708_174.html#1.

Is there any particular reason for this assert? If there is, maybe:

if (msecs < 0) msecs = 0;

would be more suitable.

Max.
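A clamp along the lines discussed in this thread might look like the sketch below. It is not the real calc_rate() from retr.c: `safe_rate`, its parameters, and the bytes-per-second return value are hypothetical, and clamping to 1 ms rather than 0 (as in Thomas's local patch) is an assumption made so that the division stays defined when no time appears to have elapsed.

```c
/* Hypothetical helper, not Wget's calc_rate(): return the transfer rate in
   bytes per second for BYTES transferred over MSECS milliseconds.  If the
   clock jumped backwards, or no time elapsed at all, treat the interval as
   1 ms instead of asserting, so the division below is always defined.  */
static double
safe_rate (long bytes, double msecs)
{
  if (msecs <= 0)
    msecs = 1;
  return 1000.0 * (double) bytes / msecs;
}
```

Since the value is only used for a progress display, a briefly inflated rate after a backwards clock jump seems a smaller problem than aborting the download.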
[BUG] assert test msecs
I got an assert exit of wget in "retr.c" in the function "calc_rate" because "msecs" is 0 or less than 0 (in rare cases). I don't know how, perhaps because I have a big line to the server or the wrong OS. To work around this I patched "retr.c", setting "msecs = 1" if equal to or below zero.

Some information is added below, what else do you need?

#: cat /proc/version
Linux version 2.4.18 (root@netbrain) (gcc version 2.96 2731 (Red Hat Linux 7.3 2.96-110)) #4 Sun Jul 28 09:01:06 CEST 2002

#: wget -V
GNU Wget 1.8.2

Greetings
Thomas

--
Thomas Hartwig, T: +49 30 69599727, F: +49 30 69599726, M: +49 172 3265984, GnuPG/PGP: 0xC51B523B, http://www.crapoud.com
test
test
Please ignore this test.
--
Adrian Aichner
Teradyne GmbH, European Design Center
Integra Test Division            Telephone +49/89/41861(0)-208
Dingolfinger Strasse 2           Fax       +49/89/41861-115
D-81673 MUENCHEN                 E-mail    [EMAIL PROTECTED]