Hi,

I would like to crawl a particular set of websites every hour to detect
content changes, but I'm not sure which storage method would best suit my
use case. I could store crawl results in JSON or CSV files, use MongoDB,
or go with something like Elasticsearch (if it supports keeping historical
records), but I'm not sure which option is best. Is anyone currently
storing and keeping a historical record of crawled content? If so, what
strategy are you using?
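
For context, here's a rough sketch of one option I'm considering: a Scrapy
item pipeline that writes each crawl into MongoDB with a timestamp and a
content hash, so a change can be detected by comparing against the most
recent stored hash. The database, collection, and field names below are
just placeholders, not anything final.

    import hashlib
    from datetime import datetime, timezone

    import pymongo


    class HistoricalMongoPipeline:
        """Store every crawled item as a new document, keeping history."""

        def open_spider(self, spider):
            self.client = pymongo.MongoClient("mongodb://localhost:27017")
            self.collection = self.client["crawl_history"]["pages"]

        def close_spider(self, spider):
            self.client.close()

        def process_item(self, item, spider):
            # Hash the page body so changes are cheap to detect later.
            content = item.get("body", "")
            content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()

            # Compare against the most recent record for this URL.
            previous = self.collection.find_one(
                {"url": item["url"]}, sort=[("crawled_at", -1)]
            )
            item["changed"] = previous is None or previous["content_hash"] != content_hash

            # Insert a new document every crawl instead of overwriting,
            # so the full history is preserved.
            self.collection.insert_one(
                {
                    "url": item["url"],
                    "content_hash": content_hash,
                    "body": content,
                    "crawled_at": datetime.now(timezone.utc),
                    "changed": item["changed"],
                }
            )
            return item

Does this kind of append-only approach hold up in practice, or is there a
better-established pattern for this?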
