I'm sorry, but you haven't actually demonstrated that there's a flaw in
Wget.
First off, there are a few ways to format PDFs (especially
linearization) that allow them to be read even if the file wasn't
completely downloaded (perhaps a page will be missing, perhaps an
element on a page will be missing). And if only 73 bytes really were
missing, some PDF readers (such as Adobe's) would simply rebuild the
cross-reference table that goes at the end, which isn't strictly
necessary for rendering. But the download also terminates at the same
place for me, and you can tell for sure whether it's complete by
looking at the tail end of the file. PDF files are binary, but they
include a lot of text, and the tail end is always a series of text
lines ending with the line "%%EOF". This PDF file ends with that line,
so it's definitely complete.
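If you want to verify that yourself, a few lines of Python will do it
(the filename here is just a placeholder for whatever you downloaded;
this is my own sketch, not anything Wget does):

    # Read the last few bytes of the file and look for the PDF trailer.
    with open("downloaded.pdf", "rb") as f:
        f.seek(-32, 2)            # 2 = seek relative to the end of file
        tail = f.read()
    if b"%%EOF" in tail:
        print("trailer present -- the file looks complete")
    else:
        print("no %%EOF marker -- the file is probably truncated")

(That assumes the file is at least 32 bytes long, which any real PDF
will be.)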
But: what is it you would expect Wget to do? If the PDF really is 73
bytes smaller than the server claimed, then the server has lied to Wget
about the size. Since the size the server reports is the only way Wget
can tell whether a file has been completely downloaded, what would you
have it do instead? It's designed to keep trying until it gets the
whole file, and that's exactly what it does.
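For what it's worth, you can see the mismatch yourself by comparing the
server's advertised Content-Length against what actually arrives; a
rough sketch (the URL is a stand-in, and this is an illustration of the
idea, not how Wget is implemented):

    import urllib.request

    url = "http://example.com/file.pdf"   # stand-in for the real URL
    with urllib.request.urlopen(url) as resp:
        claimed = resp.headers.get("Content-Length")
        body = resp.read()
    print("server claimed:", claimed, "bytes; received:", len(body))
    # If those two numbers disagree, a client can't tell a lying server
    # from a dropped connection -- so Wget retries, by design.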
Note that the second attempt shows the server doesn't support partial
retrieval (it issues a 200 OK rather than a 206 Partial Content), so if
the download fails, Wget is forced to re-download from scratch. :\
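You can probe that behaviour directly by sending a Range header and
seeing what comes back; a quick sketch (again, the URL is a
placeholder):

    import urllib.request

    req = urllib.request.Request("http://example.com/file.pdf",
                                 headers={"Range": "bytes=0-99"})
    with urllib.request.urlopen(req) as resp:
        print(resp.status)
    # 206 Partial Content means the server supports resuming;
    # a plain 200 OK means Wget has to start over from byte zero.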
** Changed in: wget (Ubuntu)
Status: New => Invalid
--
wget gives incorrect info. even when it has downloaded the whole file.
https://bugs.launchpad.net/bugs/209704