Howdy all, I'm hoping that the problem I currently have is one already solved, either in the Python standard library, or with some well-tested obvious code.
A program I'm working on needs to access a set of files locally; they're just normal files.

But those files are local cached copies of documents available at remote URLs; each file has a canonical URL for that file's content.

I'd like an API for ‘get_file_from_cache’ that looks something like::

    file_urls = {
        "foo.txt": "http://example.org/spam/",
        "bar.data": "https://example.net/beans/flonk.xml",
        }

    for (filename, url) in file_urls.items():
        infile = get_file_from_cache(filename, canonical=url)
        do_stuff_with(infile.read())

* If the local file's modification timestamp is not significantly earlier than the Last-Modified timestamp for the document at the corresponding URL, ‘get_file_from_cache’ just returns the file object without changing the file.

* The local file might be out of date (its modification timestamp may be significantly older than the Last-Modified timestamp from the corresponding URL). In that case, ‘get_file_from_cache’ should first read the document's contents into the file, then return the file object.

* The local file may not yet exist. In that case, ‘get_file_from_cache’ should first read the document content from the corresponding URL, create the local file, and then return the file object.

* The remote URL may not be available for some reason. In that case, ‘get_file_from_cache’ should simply return the existing local file object; if the local file doesn't exist either, it should raise an error.

So this is something similar to an HTTP object cache. But where those are usually URL-focussed, with the local files a hidden implementation detail, I want an API that focusses on the local files, with the remote requests a hidden implementation detail.

Does anything like this exist in the Python library, or as simple code using it? With or without the specifics of HTTP and URLs, is there some generic caching recipe already implemented with the standard library?
This local file cache (ignoring the specifics of URLs and network access) seems like exactly the kind of thing that is easy to get wrong in countless ways, and so should have a single obvious implementation available.

Am I in luck? What do you advise?

-- 
 \     “If consumers even know there's a DRM, what it is, and how it |
  `\    works, we've already failed.” —Peter Lee, Disney corporation, |
_o__)   2005 |
Ben Finney
-- 
https://mail.python.org/mailman/listinfo/python-list