On Tuesday, 14 February 2017 06:23:28 UTC+11, Алексей Боровиков wrote:
>
> I have some problems with Scrapy. I'm a beginner and need to parse data from
> 'http://www.allworthhomes.com.au'. I need to go to the 'our home' category
> and parse the items. I'm trying to write a spider but have problems with
> XPaths and rules (I think). It runs without any errors but does not collect
> the data. My code:
>
> from scrapy.contrib.spiders import CrawlSpider, Rule
> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
> from scrapy.contrib.loader.processor import TakeFirst
> from scrapy.contrib.loader import XPathItemLoader
> from scrapy.selector import HtmlXPathSelector
> from borik.items import BorikItem
>
> class BorikLoader(XPathItemLoader):
>     default_output_processor = TakeFirst()
>
> class domaSpider(CrawlSpider):
>     name = "doma"
>     allowed_domains = ["http://www.allworthhomes.com.au"]
>     start_urls = ["http://www.allworthhomes.com.au/our-homes"]
>     rules = (
>         Rule(SgmlLinkExtractor(allow=('allworthhomes.com.au\+')),
>              callback='parse_item'),
>     )
>
>     def parse_item(self, response):
>         hxs = HtmlXPathSelector(response)
>         l = BorikLoader(BorikItem(), hxs)
>         l.add_xpath('name',
>                     '//*[@id="node-home-design-overview-details"]/div[4]/div[1]/text()')
>         return l.load_item()
>
> Give some wise advice, please
What is your expected output? I can't find your element in the page source (view-source:http://allworthhomes.com.au/our-homes). If you could show the content you expect, I may be able to help.

Sayth
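In case it helps, here is a rough sketch of how one might check whether the id used in the XPath appears in a page's raw HTML at all. It uses only the Python standard library. The names IdCollector and has_element_id, and the SAMPLE_HTML string, are made up for illustration; they are not part of Scrapy or of the site.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every id attribute value seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def has_element_id(html, element_id):
    """Return True if any tag in the html string carries the given id."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return element_id in collector.ids


# Hypothetical stand-in for the downloaded page source (not the real page).
SAMPLE_HTML = """
<html><body>
  <div id="main"><a href="/our-homes/example">Example home</a></div>
</body></html>
"""

# The id taken from the XPath in the original post.
TARGET_ID = "node-home-design-overview-details"

found_in_sample = has_element_id(SAMPLE_HTML, TARGET_ID)
found_main = has_element_id(SAMPLE_HTML, "main")
print(TARGET_ID, "present:", found_in_sample)
print("main present:", found_main)
```

To try it on the real site, pass the downloaded page source in place of SAMPLE_HTML. If the id is missing from the raw source, the XPath cannot match on that page; the element may live on a different page or be added later by JavaScript, but that is only a guess without seeing the expected content.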