Retrieve HTTP Status code from crawl

Tim Fletcher Mon, 21 Nov 2011 08:43:57 -0800

Hi All,

I'm trying to get the HTTP status code associated with each crawled page,
but I can't find a way to do this.


I have tried reading the status from CrawlDatum.PARSE_DIR_NAME; however,
this gives me values such as "Status: 67 (linked)".

Also, is it possible to extract data about 301/302 redirects? For example,
I would like to trace the redirect path from page 1 to page 2 (i.e. all the
intermediary pages followed).

Any help on how to get the "raw" HTTP status codes would be
much appreciated.

Regards,
Tim
