Hi Hakim, I'm not sure how you get this "instance" with attributes related to errors. Are you catching these through an errback?
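If an errback is involved, the "instance" may be a Twisted Failure rather than a response. That is only a guess, since your message does not say. Here is a minimal sketch of reading the status in either case. The helper name extract_status is hypothetical, and the objects below are plain stand-ins rather than real Scrapy classes. It relies on a Failure exposing the wrapped exception as .value and on Scrapy's HttpError carrying the response as .response.

```python
from types import SimpleNamespace


def extract_status(obj):
    """Return the HTTP status from a response or a failure-like object, else None.

    Hypothetical helper: it duck-types instead of importing Scrapy or Twisted.
    """
    # A normal response exposes .status directly.
    status = getattr(obj, "status", None)
    if status is not None:
        return status
    # A Failure wraps the exception in .value; an HttpError exception
    # keeps the offending response in .response.
    exc = getattr(obj, "value", None)
    response = getattr(exc, "response", None)
    return getattr(response, "status", None)


# Stand-in objects (assumptions for illustration, not real Scrapy/Twisted types).
ok_response = SimpleNamespace(status=200)
http_failure = SimpleNamespace(
    value=SimpleNamespace(response=SimpleNamespace(status=404))
)
other_failure = SimpleNamespace(value=ValueError("not an HTTP error"))

print(extract_status(ok_response))    # 200
print(extract_status(http_failure))   # 404
print(extract_status(other_failure))  # None
```

In a real spider the same lookup would go in the errback, where the argument is the failure itself.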
The HttpError middleware (enabled by default) filters out non-2xx responses before they reach your callback. To receive them, define a handle_httpstatus_list attribute on your spider.

Example:

    from scrapy.spider import Spider

    class ErrorSpider(Spider):
        name = "testerror"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "http://www.dmoz.org/",
            "http://www.dmoz.org/rererere/",
        ]
        # Let 404 responses through to the callback instead of dropping them
        handle_httpstatus_list = [404]

        def parse(self, response):
            # response.status is set on every response that reaches the callback
            self.log("type: %s; status %d" % (type(response), response.status))

On Tuesday, April 15, 2014 4:51:23 PM UTC+2, Hakim Benoudjit wrote:
>
> Hi guys,
>
> I have a little issue with the response object inside a request callback
> when the page returns a 404:
> - If the page exists (HTTP code *200*), response is of type *HtmlResponse*.
> - If the page returns 404, response is of type *instance*, which contains
>   some attributes related to error messages, and in this latter case
>   *status* isn't an attribute of the *response* object.
>
> So I can know that the response *status* is *404* only if I check the
> *response* object's class (*HtmlResponse* or *instance*).
>
> How do we know that a page returns *404* if *response.status* isn't
> available as an attribute of the *response* object?