Hello. I'm writing a spider that gathers some hyperlinks from a website, 
then visits each one, checks whether something exists on the page, and 
writes the results to a text file.
I have a for loop that yields requests with a parse2 callback, which 
checks each link and updates the text file.


    # inside parse():
    evenselectorlist = response.css('table[id="result_table"] tr.even')
    for evenselector in evenselectorlist:
        relative = evenselector.css('a[title="Link"]::attr(href)').extract_first()
        yield scrapy.Request(response.urljoin(relative),
                             callback=self.parse2, meta={'item': item},
                             dont_filter=True)

    def parse2(self, response):
        # check the page and update the text file here
        ...


Is there a way to make the first parse function pause once the requests 
have been yielded? I would like to run some code only AFTER all of the 
new requests have finished. For example, I'd like a counter of how many 
links contain the information I want, and that total is only known once 
every link has been visited; a rough sketch of what I mean is below.
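
To make it concrete, here is a stripped-down sketch of what I'm trying 
to end up with (the spider name, start URL, and the '#the-thing-i-want' 
selector are invented, and closed() is just my guess at where the 
after-everything code could live):

    import scrapy

    class LinkCheckSpider(scrapy.Spider):
        name = 'linkcheck'                            # invented name
        start_urls = ['http://example.com/results']   # invented URL

        def parse(self, response):
            for row in response.css('table[id="result_table"] tr.even'):
                relative = row.css('a[title="Link"]::attr(href)').extract_first()
                yield scrapy.Request(response.urljoin(relative),
                                     callback=self.parse2, dont_filter=True)

        def parse2(self, response):
            # invented check: does the page have what I'm looking for?
            if response.css('#the-thing-i-want'):
                self.match_count = getattr(self, 'match_count', 0) + 1

        def closed(self, reason):
            # Scrapy calls closed() when the spider finishes its requests;
            # is this the right place to write out the final count?
            with open('results.txt', 'a') as f:
                f.write('matches: %d\n' % getattr(self, 'match_count', 0))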
I hope that makes sense. Thank you!
