Hi all, I am trying to scrape a website with Scrapy. The problem is that whenever I request the page I need to scrape, I am redirected to a different page. When I open the same page manually in a browser there is no redirect, and the response code is 200.
With Scrapy, however, the request is redirected and I see a 302 status code. This is the page I am trying to scrape:

http://www.lonmark.org/membership/directory/partners

In the Scrapy logs I see the following entries:

    2015-03-05 15:08:36+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/membership/directory/partners>
    2015-03-05 15:08:37+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>
    2015-03-05 15:08:37+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>
    2015-03-05 15:08:41+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>

Here is the code:

    from scrapy.http import Request
    from scrapy.selector import HtmlXPathSelector
    from scrapy.spider import BaseSpider


    class Spider(BaseSpider):
        name = "lonamrk"
        allowed_domains = ["lonmark.org"]
        # Request.meta = {'dont_redirect': True,
        #                 'handle_httpstatus_list': [302]}
        start_urls = ["http://www.lonmark.org/membership/directory/partners"]

        def parse(self, response):
            print response.url
            hxs = HtmlXPathSelector(response)
            company_links = hxs.select("//*[@id='page_content']/table/tbody/tr[1]/td[1]/a/@href")
            for link in company_links:
                # extract() returns the href value as a string
                yield Request("http://www.lonmark.org/membership/directory/" + link.extract(),
                              callback=self.parse_company_info)

If I uncomment the Request.meta lines to stop the redirection, I get nothing in the response body.

Could someone please help me figure out what to do?
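
To check the redirect outside Scrapy, here is a minimal sketch that uses only the Python 3 standard library to request a URL without following redirects, so the 302 status, the Location header, and the (possibly empty) body can be inspected directly. The names NoRedirect and fetch_without_redirect and the optional user_agent parameter are my own placeholders, and the idea that the site answers differently depending on the User-Agent is only a guess, not something the logs confirm.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any 3xx response."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError instead of following.
        return None


def fetch_without_redirect(url, user_agent=None, timeout=10):
    """Return (status, Location header or None, body bytes) for one GET."""
    opener = urllib.request.build_opener(NoRedirect)
    # user_agent is a hypothetical knob for comparing client identities.
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Location"), response.read()
    except urllib.error.HTTPError as err:
        # 3xx/4xx/5xx responses land here; the error still carries headers and body.
        return err.code, err.headers.get("Location"), err.read()
```

Calling fetch_without_redirect on the partners URL once with the default User-Agent and once with a browser-like string would show whether the 302 depends on how the client identifies itself. If both calls return 302, the cause is more likely something else, such as cookies or session state.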
