Hi, here is what I would do in this case:
    resultCounts = {}   # url -> count; a dict so we can sort by value later
    maxlen = len(pages)

    def parse(self, response):
        for page in pages:
            yield Request(page, callback=self.processPage)

    def processPage(self, response):
        count = int(response.xpath('//div[@id="count"]/text()').extract_first())
        self.resultCounts[response.url] = count
        # ALL pages visited, and all count results returned and added to
        # resultCounts: get the page with the lowest count
        if len(self.resultCounts) == self.maxlen:
            minpage = sorted(self.resultCounts.items(), key=lambda t: t[1])[0][0]
            # dont_filter=True: minpage was already crawled once, so the
            # dupefilter would otherwise drop this request
            yield Request(minpage, callback=self.processMinPage, dont_filter=True)

    def processMinPage(self, response):
        # do stuff
        pass

PS: Though I believe this should work, I haven't tried it.

Regards,
Ashish

On Sat, Aug 29, 2015 at 10:00 AM, Lee H. <popov.gh...@gmail.com> wrote:
> I'm guessing the answer to this question is going to be 'no', but I wanted
> to double-check there is not something I am missing.
>
> Let's say you have a spider that scrapes some URLs, and you want to take
> some information returned from analyzing ALL those requests BEFORE deciding
> what to do next.
>
> I know Scrapy doesn't work like the following, but here is pseudocode to
> illustrate roughly what I want to do:
>
>     def parse(self, response):
>         resultCounts = {}
>         for page in pages:
>             resultCounts[page] = yield Request(url, callback=processPage)
>
>         # ALL pages visited, and all count results returned and added to
>         # resultCounts: get the page with the lowest count
>         minpage = sorted(resultCounts.items(), key=lambda t: t[1])[0][0]
>         yield Request(minpage, callback=processMinPage)
>
>     def processPage(self, response):
>         count = response.xpath('//div[@id="count"]')
>         return count
>
>     def processMinPage(self, response):
>         # do stuff
>
> Now I know the above doesn't work: the yield Request just returns a
> deferred immediately, not the count from the callback, and we get to
> "minpage = ..." way before any of the callback chains have finished anyway.
> But I hope it illustrates the kind of thing I'd like to do in Scrapy. Is
> there any way of doing this, or would I need two spiders and a Python
> control script?