Hi Guys, 

I am trying to scrape a website with Scrapy. The problem is that whenever I 
request the page I need to scrape, I am redirected to a different page. If I 
open the same page manually in a browser it is not redirected; I checked the 
response code there as well, and it is 200. 

With Scrapy, however, the request is redirected and I see a 302 status 
code. 

This is the page I am trying to scrape: 
http://www.lonmark.org/membership/directory/partners

In the Scrapy logs I see the following entries:
2015-03-05 15:08:36+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/membership/directory/partners>
2015-03-05 15:08:37+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>
2015-03-05 15:08:37+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>
2015-03-05 15:08:41+0530 [lonamrk] DEBUG: Redirecting (302) to <GET http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>
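
To make the pattern in these entries easier to see, here is a minimal sketch 
(separate from my spider, standard library only) that pulls the redirect pairs 
out of such log text and lists any URL that is redirected to itself. The names 
parse_redirects and find_redirect_loops, and the regular expression, are my own 
assumptions based on the log format shown above; they are not part of Scrapy.

```python
import re

# Assumed shape of Scrapy's redirect debug message, taken from the log above.
REDIRECT_RE = re.compile(
    r"Redirecting \((\d{3})\) to <GET ([^>\s]+)> from <GET ([^>\s]+)>"
)


def parse_redirects(log_text):
    """Return (status, source_url, target_url) for each redirect entry."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [
        (int(status), source, target)
        for status, target, source in REDIRECT_RE.findall(flat)
    ]


def find_redirect_loops(log_text):
    """Return URLs that were redirected to themselves, in first-seen order."""
    loops = []
    for _status, source, target in parse_redirects(log_text):
        if source == target and source not in loops:
            loops.append(source)
    return loops


# Two of the entries from my log, wrapped the way my mail client wrapped them.
SAMPLE_LOG = """
2015-03-05 15:08:36+0530 [lonamrk] DEBUG: Redirecting (302) to <GET
http://www.lonmark.org/sitemap> from <GET
http://www.lonmark.org/membership/directory/partners>
2015-03-05 15:08:37+0530 [lonamrk] DEBUG: Redirecting (302) to <GET
http://www.lonmark.org/sitemap> from <GET http://www.lonmark.org/sitemap>
"""

print(find_redirect_loops(SAMPLE_LOG))
```

On the sample this lists only the /sitemap URL: the partners page is redirected 
to /sitemap once, and /sitemap is then redirected to itself.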

Here is the code: 
from scrapy.http import Request
from scrapy.selector import HtmlXPathSelector
from scrapy.spider import BaseSpider


class Spider(BaseSpider):
    name = "lonamrk"
    allowed_domains = ["lonmark.org"]
    # Request.meta = {'dont_redirect': True,
    #                 'handle_httpstatus_list': [302]}

    start_urls = ["http://www.lonmark.org/membership/directory/partners"]

    def parse(self, response):
        print response.url
        hxs = HtmlXPathSelector(response)
        company_links = hxs.select("//*[@id='page_content']/table/tbody/tr[1]/td[1]/a/@href")
        for link in company_links:
            # extract() returns the matched href as a string
            yield Request("http://www.lonmark.org/membership/directory/" + link.extract(),
                          callback=self.parse_company_info)



If I uncomment the commented-out lines in the spider to stop the redirection, 
I get nothing in the response body. 

Could someone please help me figure out what to do?
