Quoting Jochen Roderburg <[email protected]>:
This is really an "interesting" problem:
http://socds.huduser.org/permits/output_monthly_csv.odb?outpref=csv&geoval=state&datatype=monthlyF&varlist=1%232%233&yearlist=2000%232001%232002%232003%232004%232005%232006%232007%232008%232009%232010&statelist=13%2337%2345&msalist=+&cbsalist=+&bppllist=+&cntylist=13033%2313073%2313189%2313245%2337007%2337025%2337071%2337119%2337179%2345001%2345003%2345005%2345007%2345009%2345011%2345013%2345015%2345017%2345019%2345021%2345023%2345025%2345027%2345029%2345031%2345033%2345035%2345037%2345039%2345041%2345043%2345045%2345047%2345049%2345051%2345053%2345055%2345057%2345059%2345061%2345063%2345067%2345069%2345065%2345071%2345073%2345075%2345077%2345079%2345081%2345083%2345085%2345087%2345089%2345091&COUNTYSUM=YES&COUNTYALL=+&COUNTYGRP=+&STATESUM=+&STATEALL=+&METROSUM=+&METROALL=+&METRO=+&CBSA=+&PLACEGRP=+&CSUMNAME=&JSUMNAME=+&geo=state&chron=monthlyF
On Windows, older versions of wget may report the error "Result too
large", which here actually means that the file name is too long. On
Linux the message is "File name too long". And wget 1.13 with
--trust-server-names doesn't work with this site's response. Should it?
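To illustrate why the name is rejected: by default wget names the local
file after the last URL path component plus the query string, and for a
URL like the one above that is far longer than the 255 bytes that most
Linux file systems allow for a single file name. A minimal Python sketch
of that check follows. The helpers url_derived_name and
exceeds_name_limit, the NAME_MAX value of 255 and the example.org URL are
assumptions made for illustration; this is not wget's actual code.

```python
from urllib.parse import urlsplit

# Assumption: 255 bytes per file name, the usual limit on Linux file systems.
NAME_MAX = 255


def url_derived_name(url):
    """Build a local file name from the last path component plus the query."""
    parts = urlsplit(url)
    base = parts.path.rsplit("/", 1)[-1] or "index.html"
    return f"{base}?{parts.query}" if parts.query else base


def exceeds_name_limit(name, limit=NAME_MAX):
    """Return True if the encoded name is longer than the per-name limit."""
    return len(name.encode("utf-8")) > limit


if __name__ == "__main__":
    # Hypothetical URL with a long query string, standing in for the real one.
    query = "&".join(f"k{i}=v{i}" for i in range(100))
    name = url_derived_name("http://example.org/report.odb?" + query)
    print(len(name), exceeds_name_limit(name))
```

A short name such as BuildingPermits.csv passes the same check, which is
why a server-supplied name would avoid the error.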
Well, in theory it should work with "--content-disposition=on", as
the web application sends a Content-Disposition header with a
file name:
---response begin---
HTTP/1.1 200 OK
Content-Type: application/vnd.ms-excel
Server: Microsoft-IIS/6.0
Content-Disposition: attachment; filename=BuildingPermits.csv;
X-Powered-By: ASP.NET
Date: Sat, 17 Sep 2011 05:58:06 GMT
Connection: close
---response end---
... but wget seems to bail out with the overlong filename *before*
it reads the response headers.
After further examination I must retract the "before" assumption.
The debug output shows the GET response headers, including
Content-Disposition, and the error message appears only after them,
so it looks more as if the Content-Disposition header is simply
ignored for some unknown reason.
Sorry for the noise. As so often, the whole truth is more complicated,
and one has to test very carefully to avoid all side effects.
New result: it works as expected with wget's default options plus
--content-disposition=on.
It does not work, however, when --timestamping is added as well
(which of course makes no sense for this kind of dynamically generated
output, but I have it set as my default and it somehow crept into my
tests, although I tried to avoid it ;-).
FWIW, in this case I see the following sequence in the debug output:
wget does a HEAD request first and gets a "standard" response
*without* Content-Disposition.
Then it makes a GET and gets the Content-Disposition.
And in this situation it seems to ignore the Content-Disposition header.
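The HEAD/GET difference described above can be checked independently of
wget. The following Python sketch sends one request per method and
compares the file name that each response offers in Content-Disposition.
The function names are assumptions chosen for illustration, and the code
only inspects headers; it does not reproduce wget's own logic.

```python
from email.message import Message
from urllib.request import Request, urlopen


def filename_from_headers(headers):
    """Return the Content-Disposition file name from (name, value) pairs.

    Returns None when no file name is offered.
    """
    msg = Message()
    for name, value in headers:
        msg[name] = value
    return msg.get_filename()


def fetch_headers(url, method):
    """Send a single request with the given method and return its headers."""
    with urlopen(Request(url, method=method)) as resp:
        return list(resp.headers.items())


def compare_head_and_get(head_headers, get_headers):
    """Report which file name, if any, each response advertises."""
    return {
        "HEAD": filename_from_headers(head_headers),
        "GET": filename_from_headers(get_headers),
    }
```

For a live check, pass fetch_headers(url, "HEAD") and
fetch_headers(url, "GET") to compare_head_and_get. A result with None for
HEAD and a name for GET would match the sequence seen in the debug output.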
Best regards,
Jochen Roderburg