I am trying to delete the cache and I can't find its location.
It's not in the .scrapy directory because:
1. I deleted the .scrapy folder
2. I created a blank Scrapy project
so there is absolutely no way the cache is coming from the .scrapy folder.

This method:

$ scrapy shell -s HTTPCACHE_ENABLED=True
2017-02-28 10:41:41 [scrapy.utils.log] INFO: Scrapy 1.3.2 started (bot: httpbin)
(...)
>>> from scrapy.utils.misc import load_object
>>> storage = load_object(settings['HTTPCACHE_STORAGE'])(settings)
>>> storage.cachedir
'/home/paul/scrapy/httpbin/.scrapy/httpcache'

only gives me the same location inside the .scrapy folder.

I am 100% positive the data is being read from a cache somewhere because:
1. it takes 15 seconds to run the spider (as opposed to 10-15 hours)
2. I disconnected from the internet and the spider continues to get data
3. requests have the cached flag:

2017-02-28 10:47:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.example.com> (referer: None) ['cached']


I tried to use a Windows tool (Process Monitor) to see which files are 
accessed.
I see a lot of files created in the blank .scrapy folder that I just 
created, but no large number of files being read from anywhere else.
So the only sane explanation would be that Scrapy has a database that is a 
single file and just reads from it (which would explain why I don't see lots 
of cache files being read).
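
To double-check the Process Monitor result, here is a minimal sketch of a disk search for leftover cache directories. It assumes the cache lives in a directory named .scrapy or httpcache (the names that appear in the shell output above); a custom HTTPCACHE_DIR with a different name would not be found. find_cache_dirs and dir_stats are hypothetical helper names, not part of Scrapy, and only the standard library is used.

```python
import os


def dir_stats(path):
    """Return (file_count, total_bytes) for everything below path."""
    count = 0
    size = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                size += os.path.getsize(os.path.join(dirpath, name))
                count += 1
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return count, size


def find_cache_dirs(root, names=(".scrapy", "httpcache")):
    """Yield (path, file_count, total_bytes) for each directory under root
    whose name is in names. Assumption: the cache uses one of these names."""
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name in names:
                full = os.path.join(dirpath, name)
                count, size = dir_stats(full)
                yield full, count, size
```

Running list(find_cache_dirs(some_root)) on the drive or home directory lists every match with its file count and size. A .scrapy/httpcache pair is reported twice (once for each directory), so a large cache sitting in an unexpected project folder should stand out.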

So my question is: is there such a thing as a default Scrapy database 
where Scrapy keeps its cache?
If not, then where is my cache magically reappearing from?
