Hi, I think the problem is that you use sel.xpath inside the "for site in sites" loop. You should use site.xpath inside the loop instead (though I don't remember for certain whether this is right).
Or try the following XPath for the rows:

//div[@id="search-results"]/table/tbody/tr

That way each iteration of the loop corresponds to one entity or corporation.

Good luck!
Leo

On Tue, Apr 1, 2014 at 8:40 PM, Dc1981 <da...@colossalpoint.com> wrote:
> Hello,
>
> I am trying to scrape a website for practice and basically what I am
> trying to accomplish is to pull all the companies that are active and
> download them to a CSV file. You can see my code pasted below. I am not
> sure what I am doing wrong.
>
> Also I think the spider is crawling the website multiple times based on
> its output. I only want it to crawl the site once every time I run it.
>
> from scrapy.spider import Spider
> from scrapy.selector import Selector
> from bizzy.items import BizzyItem
>
> class SunSpider(Spider):
>     name = "Sun"
>     allowed_domains = ['sunbiz.org']
>     start_urls = [
>         'http://search.sunbiz.org/Inquiry/CorporationSearch/SearchResults/EntityName/a/Page1'
>     ]
>
>     def parse(self, response):
>         sel = Selector(response)
>         sites = sel.xpath('//tbody/tr')
>         items = []
>         for site in sites:
>             item = BizzyItem()
>             item["company"] = sel.xpath('//td[1]/a/text()').extract()
>             item["status"] = sel.xpath('//td[3]/text()').extract()
>             if item["status"] != 'Active':
>                 pass
>             else:
>                 items.append(item)
>         return items
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scrapy-users+unsubscr...@googlegroups.com.
> To post to this group, send email to scrapy-users@googlegroups.com.
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.