I am trying to extract product info from a website using my Scrapy spider. What is the best approach to avoid crawling data that was already extracted in a previous run of the spider? From my research, I can use either a middleware or an item pipeline. I would appreciate any help.
For example, if I run the spider now and get 30 items, then on the second run I do not want to get those items again; I only want the new items that have not been extracted yet. Thanks
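To make the item pipeline option concrete, here is a minimal sketch of what I have in mind. It assumes each item carries a field that uniquely identifies a product (the `url` field here is hypothetical) and that the keys seen so far can be kept in a local SQLite file (`seen_items.db` is also a made-up name). The class name `SkipSeenItemsPipeline` is mine, not something Scrapy provides; only `open_spider`, `close_spider`, `process_item` and `DropItem` are Scrapy's own hooks.

```python
import sqlite3

try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Stand-in so the sketch can be exercised without Scrapy installed.
    class DropItem(Exception):
        pass


class SkipSeenItemsPipeline:
    """Drop items whose key was already recorded, including by earlier runs."""

    def __init__(self, db_path="seen_items.db", key_field="url"):
        self.db_path = db_path      # hypothetical file that persists between runs
        self.key_field = key_field  # hypothetical field identifying a product

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY)"
        )

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes; persist the new keys.
        self.conn.commit()
        self.conn.close()

    def process_item(self, item, spider):
        key = item[self.key_field]
        # INSERT OR IGNORE changes no rows when the key is already present.
        cur = self.conn.execute(
            "INSERT OR IGNORE INTO seen (key) VALUES (?)", (key,)
        )
        if cur.rowcount == 0:
            raise DropItem(f"already extracted: {key}")
        return item
```

The pipeline would be enabled through the `ITEM_PIPELINES` setting, e.g. `ITEM_PIPELINES = {"myproject.pipelines.SkipSeenItemsPipeline": 300}` (the module path is hypothetical). As far as I understand, a pipeline like this only filters the items; the pages are still requested on every run, so avoiding the requests themselves would need the middleware route.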
