Hi,

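I think the problem is that you call sel.xpath inside the "for site in
sites" loop, so every iteration searches the whole page again.
You should call site.xpath inside the loop instead, with a path relative
to the current row (but I don't remember for certain whether this is
right).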

Or try the following XPath to select the rows:
//div[@id="search-results"]/table/tbody/tr

Each result you iterate over will then be one entity or corporation.


Good Luck!
Leo


On Tue, Apr 1, 2014 at 8:40 PM, Dc1981 <da...@colossalpoint.com> wrote:

> Hello,
>
> I am trying to scrape a website for practice and basically what I am
> trying to accomplish is to pull all the companies that are active and
> download them to a CSV file. You can see my code pasted below. I am not
> sure what I am doing wrong.
>
> Also I think the spider is crawling the website multiple times based on
> its output. I only want it to crawl the site once every time I run it.
>
> from scrapy.spider import Spider
> from scrapy.selector import Selector
> from bizzy.items import BizzyItem
>
> class SunSpider(Spider):
>     name = "Sun"
>     allowed_domains = ['sunbiz.org']
>     start_urls = [
>         'http://search.sunbiz.org/Inquiry/CorporationSearch/SearchResults/EntityName/a/Page1'
>     ]
>
>
>     def parse(self, response):
>         sel = Selector(response)
>         sites = sel.xpath('//tbody/tr')
>         items = []
>         for site in sites:
>             item = BizzyItem()
>             item["company"] = sel.xpath('//td[1]/a/text()').extract()
>             item["status"] = sel.xpath('//td[3]/text()').extract()
>             if item["status"] != 'Active':
>                 pass
>             else:
>                 items.append(item)
>         return items
>
>  --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scrapy-users+unsubscr...@googlegroups.com.
> To post to this group, send email to scrapy-users@googlegroups.com.
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>

