-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ryan Schmidt wrote:
> Hello,
>
> I was advised last year that the wget test suite should not be used, and
> to keep an eye on the file tests/README to see if that changed:
>
> http://marc.info/?l=wget&m=121205151029359&w=2
>
> That file no longer exists in the wget 1.12 distribution and upon trying
> "make check" on Mac OS X 10.6.1 I see most of the tests pass. Is the
> test suite something that should be used now? I didn't see it mentioned
> in the NEWS or ChangeLog files.

Yes, the tests should now be reliable on Unix-like systems that provide
the necessary prerequisites.

> The tests that are failing for me are all the IRI ones. The IDN tests
> and all the others are passing fine. I have libidn 1.15 installed.
> Attached is the log of the failed tests.

- - Test-ftp-iri.px:

Despite the fact that Wget reports having saved a file with byte values
"fran\303\247ais.txt", the test apparently finds one with byte values
"franc\314\247ais.txt". It would seem to have been "normalized",
replacing the single Unicode character "lowercase c with cedilla" with
"c" followed by the combining character "cedilla". Annoyingly,
gnome-terminal erroneously displays this as francąis, instead of
français (though gvim displays it correctly). It took a little trouble
to verify it was the right sequence.

So the trouble seems to be that the name Wget writes,
"fran\303\247ais.txt", gets normalized to "franc\314\247ais.txt" before
being stored in the filesystem. Perhaps this is something that Mac OS
does, itself?

- - Test-ftp-iri-fallback.px, Test-ftp-iri-recursive.px, and several
others.

The two named above have the wrong information about what their test
name is. run-px prints this information anyway, so I should probably
remove the "name" settings within the tests themselves (I never liked
them in the first place: redundant information begs for mistakes like
this).

A large number of tests fail because Wget saves a file with direct bytes
for latin1 encodings, and the test then finds the file back with those
bytes URL-encoded. I don't believe that Wget is doing this encoding,
though it would if it had "restrict_file_names" set to "ascii". But if
that were the case, Wget would announce it was saving 'fran%E7ais.txt',
not 'fran\347ais.txt'. Looks like it's another automatic transcoding by
the operating system.

...There's also an interesting message about an uninitialized string on
FTPServer.pm:251. I can reproduce that one, so I'll look into it.

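To make the two byte-level transformations above concrete, here is a
minimal Python sketch. It is not part of the Wget test suite, and the
helper name octal_escape is made up for illustration. It assumes that
standard Unicode NFD is a fair stand-in for whatever normalization the
filesystem applies, and that the URL-encoding seen in the logs is plain
percent-encoding of each non-ASCII byte.

```python
import unicodedata
from urllib.parse import quote_from_bytes


def octal_escape(data: bytes) -> str:
    """Render non-printable and non-ASCII bytes as \\ooo octal escapes,
    the notation used in the test logs (hypothetical helper)."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else "\\%03o" % b for b in data
    )


# Name as Wget reports saving it: precomposed U+00E7, encoded as UTF-8.
saved = b"fran\xc3\xa7ais.txt"

# Canonical decomposition (NFD): "c" followed by U+0327 COMBINING CEDILLA.
# Assumption: this approximates what the filesystem does to the name.
decomposed = unicodedata.normalize("NFD", saved.decode("utf-8")).encode("utf-8")

# Latin-1 name containing the raw byte 0xE7, and its percent-encoded form.
latin1 = b"fran\xe7ais.txt"
percent = quote_from_bytes(latin1)

print(octal_escape(saved))       # name Wget says it wrote
print(octal_escape(decomposed))  # name after decomposition
print(octal_escape(latin1))      # latin1 name Wget says it wrote
print(percent)                   # same name, percent-encoded
```

A test that compares file names byte for byte will treat each of these
pairs as different files, even though they name the same thing to a
human reader.
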
It looks like the troubles you're experiencing are due to the fact that
the Wget tests assume that the filesystem can take any arbitrary set of
bytes, and will store them as they were given. This is apparently not
the case for Mac OS, and I should probably have known better than to
assume it would be a universal case. Ryan, can you please confirm for me
that this is indeed what's happening?

I'll have to rework the tests so that they don't make these assumptions,
at least on systems that can't handle them. But the good news is that
this appears to be a problem with the tests, and not with Wget; IRIs
should work fine, provided we don't use the --local-encoding option to
lie about what encoding is acceptable locally.

- -- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkq7syAACgkQ7M8hyUobTrHWGQCdG4dN3quV+KChxDE61KMmxKz1
FrAAn3spxTZXkf9P4waWETYkIluZRevI
=fhas
-----END PGP SIGNATURE-----
