Hi, I would like to crawl a particular set of websites every hour to detect content changes, but I'm not sure what storage method would be best for my use case. I could store the crawl results in JSON or CSV files, use MongoDB, or go with something like Elasticsearch (if it supports keeping historical records), but I'm not sure which option is best. Is anyone currently storing and keeping a historical record of crawled content? If so, what strategy are you using?
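
To make the question concrete, below is a rough sketch of what I'm imagining if I went the MongoDB route: a Scrapy item pipeline that stores each crawl as a timestamped snapshot and uses a content hash to detect changes between hourly runs. The pipeline name, the item fields (`url`, `content`), the collection name, and the Mongo URI are all just placeholders I made up for illustration, not anything from an existing project.

```python
# Hypothetical sketch: store one MongoDB document per changed snapshot of a page.
import hashlib
from datetime import datetime, timezone

import pymongo


class HistoricalStoragePipeline:
    """Keep a history of crawled pages; insert a new document only when content changes."""

    def __init__(self, mongo_uri="mongodb://localhost:27017", db_name="crawls"):
        self.mongo_uri = mongo_uri
        self.db_name = db_name

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name]["snapshots"]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        doc = dict(item)
        # Hash the page content so a change is detectable by comparing against
        # the most recent snapshot stored for the same URL.
        doc["content_hash"] = hashlib.sha256(doc["content"].encode("utf-8")).hexdigest()
        doc["crawled_at"] = datetime.now(timezone.utc)

        latest = self.collection.find_one(
            {"url": doc["url"]}, sort=[("crawled_at", pymongo.DESCENDING)]
        )
        # Insert only when the content differs from the last stored version,
        # so the collection accumulates a history of distinct versions per URL.
        if latest is None or latest["content_hash"] != doc["content_hash"]:
            self.collection.insert_one(doc)
        return item
```

Presumably I would then enable it in settings.py with something like `ITEM_PIPELINES = {"myproject.pipelines.HistoricalStoragePipeline": 300}`, but I don't know whether this per-URL snapshot approach scales better than just dumping files or indexing into Elasticsearch, which is really what I'm asking about.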
