On Jan 6, 2011, at 3:21am, Sam Crawford wrote:

This is what the HEAD request method is for - it gets the headers
without the body. For example:

[snip]

One caveat - some servers fail when you make a HEAD request, but work with a GET.

We ran into this while trying to efficiently handle link shorteners during web crawls.

I would suggest retrying with a GET if a HEAD request fails.
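
A rough sketch of that retry, for illustration only. To keep it self-contained it uses the JDK's HttpURLConnection rather than Commons HttpClient (with HttpClient 3.x you would swap HeadMethod for GetMethod instead). The class name, method names, timeouts, and the "any status >= 400 counts as a failure" rule are all assumptions of mine, so adjust them to your crawler:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
class HeaderFetcher {

    /** Sends one request with the given method and returns the response headers. */
    static Map<String, List<String>> fetchHeaders(String url, String httpMethod) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod(httpMethod);
        conn.setConnectTimeout(5000); // assumed timeouts, in milliseconds
        conn.setReadTimeout(5000);
        try {
            int status = conn.getResponseCode(); // this call actually sends the request
            if (status >= 400) {
                // Assumption: treat any 4xx/5xx as "this method failed".
                throw new IOException(httpMethod + " " + url + " returned " + status);
            }
            // Copy the headers; the status line is stored under the null key.
            return new LinkedHashMap<>(conn.getHeaderFields());
        } finally {
            // For a GET this drops the connection without reading the body.
            conn.disconnect();
        }
    }

    /** Tries HEAD first and retries once with GET if the HEAD request fails. */
    static Map<String, List<String>> fetchHeadersWithFallback(String url) throws IOException {
        try {
            return fetchHeaders(url, "HEAD");
        } catch (IOException headFailed) {
            return fetchHeaders(url, "GET");
        }
    }
}
```

Calling HeaderFetcher.fetchHeadersWithFallback("http://www.google.com") would then return the header map either way. The GET fallback may still transfer part of the body before the connection is dropped, so it is only worth paying for when the HEAD has already failed.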

-- Ken

# curl -I -v http://www.google.com
* About to connect() to www.google.com port 80
*   Trying 74.125.45.105... connected
* Connected to www.google.com (74.125.45.105) port 80
> HEAD / HTTP/1.1
> User-Agent: curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
> Host: www.google.com
> Accept: */*

< HTTP/1.1 200 OK
< Date: Thu, 06 Jan 2011 11:20:57 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
< Set-Cookie: PREF=ID=6fa2fec0f809889b:FF=0:TM=1294312857:LM=1294312857:S=7WeuO7OI73SOSQdT; expires=Sat, 05-Jan-2013 11:20:57 GMT; path=/; domain=.google.com
< Server: gws
< X-XSS-Protection: 1; mode=block
< Transfer-Encoding: chunked

So rather than using the GET method (probably defined earlier in your
code where you instantiate "method"), you should be able to just
change it to HEAD.

Thanks,

Sam


On 6 January 2011 11:11, Hannes Carl Meyer <[email protected]> wrote:
Hi,

I have a big list of URLs and the only data I need to fetch are the server
response headers for each URL.
What is the fastest way to get this data while NOT downloading the actual
content?

My current code:

client.executeMethod(method);
Header[] headers = method.getResponseHeaders();
// (do something with the header data)
method.releaseConnection();

The code actually works fine; I just wonder whether there are other, more efficient ways to retrieve the server response headers for a particular URL.

Thanks

Hannes


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g





