If I don't find "No Information" in response.body, I want to write the good URLs to a file. I am struggling to build the filter. Also, is there maybe a better way of storing the good URLs and then crawling back through them once the raw_urls have been filtered?

    def start_requests(self):
        raw_urls = generate_result_urls(self.YEAR, self.YEARS)
        for url in raw_urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # visit each page; if "No Information." is absent, it is a good URL,
        # so record it in a file
        f_ = 'goodurls.txt'
        if b"No Information." not in response.body:
            print(response.url)
            with open(f_, 'a') as f:
                f.write(response.url + '\n')

--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
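One way to structure this is to keep the filter as a pure function and collect good URLs in memory, writing the file once at the end (for example from Scrapy's `closed(self, reason)` hook) instead of reopening it inside every `parse` call. A minimal sketch of that idea in plain Python, with no Scrapy dependency — `is_good` and `GoodUrlStore` are hypothetical helpers, not part of Scrapy's API:

```python
def is_good(body: bytes) -> bool:
    """A page is 'good' when the placeholder text is absent."""
    return b"No Information." not in body


class GoodUrlStore:
    """Accumulate good URLs in memory; write them out once at the end."""

    def __init__(self):
        self.urls = []

    def add(self, url: str) -> None:
        self.urls.append(url)

    def flush(self, path: str) -> None:
        # one write at shutdown instead of an open() per response
        with open(path, "w") as f:
            f.write("\n".join(self.urls) + "\n")


# Example with fake (url, body) pairs standing in for responses:
pages = [
    ("http://example.com/1", b"<html>real data</html>"),
    ("http://example.com/2", b"<html>No Information.</html>"),
]
store = GoodUrlStore()
for url, body in pages:
    if is_good(body):
        store.add(url)
# store.urls now holds only http://example.com/1
```

In the spider you would call `self.store.add(response.url)` inside `parse` and `self.store.flush('goodurls.txt')` in the spider's `closed()` method. And for crawling back through the good URLs, you can often skip the file round-trip entirely: yield new `scrapy.Request` objects for the good URLs directly from `parse`, with a second callback.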