I am trying to delete the cache and I can't find its location. It's not in the .scrapy directory because:
1. I deleted the .scrapy folder
2. I created a blank Scrapy project, so there is absolutely no way the cache is coming from the .scrapy folder

This method:

    $ scrapy shell -s HTTPCACHE_ENABLED=True
    2017-02-28 10:41:41 [scrapy.utils.log] INFO: Scrapy 1.3.2 started (bot: httpbin)
    (...)
    >>> from scrapy.utils.misc import load_object
    >>> storage = load_object(settings['HTTPCACHE_STORAGE'])(settings)
    >>> storage.cachedir
    '/home/paul/scrapy/httpbin/.scrapy/httpcache'

only gives me the same location inside the .scrapy folder.

I am 100% positive the data is taken from a cache somewhere because:
1. It takes 15 seconds to run the spider (as opposed to 10-15 hours)
2. I disconnected from the internet and the spider continues to get data
3. Requests have the cached flag:

    2017-02-28 10:47:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.example.com> (referer: None) ['cached']

I tried to use a Windows tool (Process Monitor) to see which files are accessed. I see a lot of files created in the blank .scrapy folder that I just created, but no other large numbers of files being read from anywhere else.

So the only sane explanation would be that Scrapy has a database which is a single file and just reads from it (which would explain why I don't see lots of cache files being read: the source would be a single database file).

So my question is: is there such a thing as a default Scrapy database where Scrapy keeps its cache? If not, then where is my cache magically reappearing from?
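
In case it helps narrow this down: below is a minimal sketch in plain Python (no Scrapy import) that searches a directory tree for other `.scrapy/httpcache` folders and counts the files in each. It assumes the filesystem layout shown in the `storage.cachedir` output above; it would not detect a cache kept by a different storage backend or in a custom location. `find_httpcache_dirs` is a hypothetical helper name, not part of Scrapy.

```python
import os
import sys


def find_httpcache_dirs(root):
    """Return (path, file_count) for every '.scrapy/httpcache' directory under root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        # Assumption: the cache lives at <something>/.scrapy/httpcache,
        # as in the cachedir path printed by the shell session.
        if os.path.basename(dirpath) == ".scrapy" and "httpcache" in dirnames:
            cache_dir = os.path.join(dirpath, "httpcache")
            # Count every file below the cache directory.
            file_count = sum(len(files) for _, _, files in os.walk(cache_dir))
            found.append((cache_dir, file_count))
            # Already counted, so do not walk into the cache a second time.
            dirnames.remove("httpcache")
    return found


# Only scan when a start directory is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for path, count in find_httpcache_dirs(sys.argv[1]):
        print(f"{count:8d} files  {path}")
```

Saved under any file name and run with a start directory as its only argument (for example a drive root or the home directory), it prints one line per cache folder found. A folder with a large file count outside the blank project would be a candidate for where the cached responses come from.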